In one embodiment, the nucleotide and polypeptide sequences of the present invention may
be used to design selective CA inhibitors. Studies have also shown that the different alpha-CAs have
different inhibitor binding properties (Sly et al., (1995), supra), suggesting that it may be possible to
provide compounds that inhibit a CA isozyme of interest, such as CA II, while not binding to or
inhibiting related enzymes such as the polypeptide of SEQ ID No. 390. The nucleic acid and
polypeptide sequences of the invention can be used in computer-based drug design or for carrying
out binding predictions with candidate CA inhibitors in view of the extensive structural information
publicly available for CA enzymes. In preferred embodiments, the nucleic acid and polypeptide of
the invention are used in drug screening assays, including both cell-based and non-cell-based assays.
In one embodiment, a nucleotide or polypeptide sequence of the invention is brought into contact
with a candidate CA inhibitor (such as a CA II inhibitor), and binding of the candidate inhibitor to
the polypeptide of the invention, or the activity of the polypeptide of the invention, is detected.
Activity of the polypeptide of the invention may be CA activity, or any other suitable activity
possessed by the polypeptide of the invention which may be inhibited by binding of the candidate
substance. In preferred embodiments, a panel of CA isozymes including the polypeptide of the
invention is screened against the candidate substance, including the polypeptide of SEQ ID No 39
and one or more enzymes selected from the group consisting of CA I, CA III, CA IV, CA VI, and a
CARP including but not limited to CARP VII, CARP X and CARP XI. In preferred embodiments, a
candidate CA inhibitor is screened against one or more non-catalytic CA-related proteins to
eliminate undesired inhibition of these proteins, which may be involved in other important
physiological functions. Means to conduct such drug screening assays are well known in the art.

Increasing alpha-CA activity for the treatment of alpha-CA deficiency disease 

The polypeptide of the invention may also be used as a source of CA activity, such as for
the treatment of disease. Defects in carbonic anhydrases are the cause of several diseases,
including osteopetrosis (abnormally dense bone), renal tubular acidosis, cerebral calcification and
mental retardation. Also, a carbonic anhydrase-related protein is described as being linked to
cone-rod retinal dystrophy (Bellingham et al., 1998, Biochem. Biophys. Res. Comm. 253: 364-367).

In one aspect, the invention thus involves increasing CA activity by providing increased
activity of the polypeptide of SEQ ID No. 390. Increased activity of the polypeptide of SEQ ID No
390 can be provided by any suitable means, as further described herein. Activity may be provided,
for example, by introducing to a host cell or patient a vector containing a nucleotide sequence of
SEQ ID No 149, treating said cell with a compound capable of increasing the expression of the
polypeptide of the invention and/or treating a cell or patient directly with a polypeptide of SEQ ID
No 390. In preferred embodiments, the polypeptide of the invention comprises at least one amino
acid substitution, deletion or insertion. In one aspect, such amino acid changes are preferably in the
catalytic site; preferably said amino acid changes involve the substitution, deletion or insertion of a
His residue and preferably said amino acid changes increase the CO2 hydration activity of the
polypeptide of the invention.

Metal ion biosensors

In further aspects, metal ion biosensors can be designed based on the polypeptide of SEQ
ID No 390. Determination of metal ion concentrations in complex media such as serum, cell
cytoplasm and, for example, seawater is an important analytical function that requires high
degrees of sensitivity and selectivity.

Biosensors may be particularly useful in detecting metal ion fluxes in and between cells.
Such biosensors may exploit the metal-binding ability of the polypeptide of the invention, as described
by Thompson et al., who have developed such biosensors based on the CA enzyme (CA II). Such
biosensors are useful in the detection of metal ion flux, for example in the central nervous system.
Zinc-containing neurons found throughout the mammalian cerebral cortex, striatum and amygdalar
nuclei have been shown to release their zinc in a depolarization- and calcium-dependent fashion in
vitro and in vivo. This zinc release has been suggested to act as a trans-synaptic neuromodulator,
which has in turn been linked to excitotoxic neuronal cell death. CA-based biosensors developed by
Thompson et al. showed that zinc is present and can be detected in extracellular medium from
neurons (Thompson et al., J. Neurosci. Methods 96:35-45 (2000)).

Biosensors based on CA have been shown to be extremely sensitive, detecting Cu at
subpicomolar levels, a sensitivity comparable to that which might be achieved with mass spectrometric
techniques. Sensors based on the CA II isozyme have been shown to detect Zn and Cu at picomolar
levels, and Cd, Co and Ni at nanomolar levels (Thompson et al., Anal. Biochem. 267:185-195
(1999)). CA-based biosensors have also demonstrated selectivity over potential interferents, such as
Mg and Ca, that are present in biological systems at mM levels in extracellular fluids (Thompson et al.
(2000), supra).

Biosensors based on the polypeptide of the invention are based on the high selectivity and
sensitivity of CA isozymes for zinc. Because the binding of Zn in the active site of the enzyme
affects the enzyme's ability to bind a CA inhibitor, it is possible to use a CA inhibitor that exhibits a
detectable change upon binding to the polypeptide of the invention to detect the fraction of
polypeptide bound to the inhibitor, and therefore bound to Zn. The fraction of polypeptide with
bound Zn in turn is determined by the concentrations of free Zn and the polypeptide of the
invention, and the dissociation constant for zinc.
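
The relationship between free Zn, the dissociation constant and the bound fraction can be illustrated with a minimal sketch. It assumes simple 1:1 equilibrium binding with the free Zn concentration as the independent variable (for example, a metal-buffered sample), so that the bound fraction is [Zn]free / ([Zn]free + Kd). The function names and the 4 pM dissociation constant used in the usage note are illustrative assumptions and are not values taken from the present application.

```python
def fraction_bound(free_zn_molar: float, kd_molar: float) -> float:
    """Fraction of polypeptide with Zn bound, assuming 1:1 equilibrium binding.

    free_zn_molar: free (unbound) Zn concentration in mol/L.
    kd_molar: dissociation constant for Zn in mol/L (hypothetical value supplied by caller).
    """
    if free_zn_molar < 0 or kd_molar <= 0:
        raise ValueError("free Zn must be >= 0 and Kd must be > 0")
    return free_zn_molar / (free_zn_molar + kd_molar)


def free_zn_from_fraction(fraction: float, kd_molar: float) -> float:
    """Invert the isotherm: free Zn concentration from a measured bound fraction."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    if kd_molar <= 0:
        raise ValueError("Kd must be > 0")
    return kd_molar * fraction / (1.0 - fraction)
```

Under these assumptions, a sample whose free Zn equals the dissociation constant (for instance `fraction_bound(4e-12, 4e-12)`) gives half-saturation, and a measured bound fraction can be converted back to a free Zn concentration with `free_zn_from_fraction`.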

In one example, binding of the CA inhibitor to the polypeptide of the invention is detected
by using a fluorescent inhibitor, whereby the inhibitor shows a detectable change in fluorescence
emission wavelength or polarization upon binding to the polypeptide of the invention. In one
example, a fluorescent sulfonamide is used, such as the fluorophore ABD-N (Thompson et al.
(2000), supra).
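
A polarization readout of this kind is commonly reduced to a bound fraction through the steady-state anisotropy. The following is a sketch under stated assumptions, not the method of the cited reference: it uses the standard anisotropy definition r = (I_par - G*I_perp) / (I_par + 2*G*I_perp) and a linear interpolation between the anisotropies of free and fully bound inhibitor, which holds only if the quantum yield of the inhibitor does not change on binding (otherwise an intensity correction is needed). All function and parameter names are hypothetical.

```python
def anisotropy(i_parallel: float, i_perpendicular: float, g_factor: float = 1.0) -> float:
    """Steady-state fluorescence anisotropy from polarized emission intensities.

    g_factor corrects for unequal detection of the two polarizations (instrument specific).
    """
    denominator = i_parallel + 2.0 * g_factor * i_perpendicular
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - g_factor * i_perpendicular) / denominator


def bound_fraction_from_anisotropy(r_observed: float, r_free: float, r_bound: float) -> float:
    """Fraction of inhibitor bound, assuming equal quantum yield for free and bound forms."""
    if r_bound == r_free:
        raise ValueError("r_bound and r_free must differ")
    fraction = (r_observed - r_free) / (r_bound - r_free)
    # Clamp small excursions caused by measurement noise.
    return min(1.0, max(0.0, fraction))
```

For example, with reference anisotropies measured for the free inhibitor and for the inhibitor saturated with Zn-bound polypeptide, an intermediate observed anisotropy maps linearly onto the bound fraction.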

Engineered CA enzymes

CA isozymes have been shown to have differing levels of catalytic activity and efficiency.
In preferred embodiments, particularly for treatments which involve providing the increased activity
of the polypeptide of SEQ ID No 390 or for use in metal ion biosensors, the polypeptide of the
invention may be modified for increased CO2 hydration and/or zinc binding.

In particular, studies have been carried out characterizing residues important for maximal
CA activity, allowing CA isozymes to be designed having desired levels of activity. Important
structural elements in CA isozymes for zinc binding, CO2 hydration activity and stability are
reviewed in Lindskog, Pharmacol. Ther. 74(1): 1-20 (1997) and Sly (1995), supra. In one example,
studies of CA III showed that changing the Phe198 residue to a Leu198 residue (as in CA II)
resulted in a 25-fold increase in activity (Chen et al., (1993), supra). Catalysis has also been greatly
increased in CA II by replacing the Thr200 residue with His, as is normally found in CA I enzymes.
Most dramatically, a CA-related protein (CA-RP) which in its native form was missing important
residues at the catalytic site and had no detectable CO2 hydration activity at all was rendered an
active CA by only two point mutations (Sjoblom et al., FEBS Lett. 398: 322-325 (1996)).

Thus, in embodiments where the polypeptide of the invention is used to provide a source of
CO2 hydration or for its zinc binding properties, it is advantageous to modify the polypeptide of the
invention by introducing at least one amino acid substitution, deletion or insertion. In one aspect,
such amino acid changes are preferably in the catalytic site; preferably said amino acid changes
involve the substitution, deletion or insertion of a His residue and preferably said amino acid
changes increase the CO2 hydration activity of the polypeptide of the invention. Optimal amino
acid changes can be determined by the skilled artisan, particularly in view of sequence comparisons
which can be carried out with the many well-characterized CA isozymes.

Protein of SEQ ID NO:252 (internal designation 105-089-3-0-G10-CS)

The protein of SEQ ID NO:252 is encoded by the cDNA of SEQ ID NO:11. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:252
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 105-089-3-0-G10-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:11 described throughout the present application also pertain
to the human cDNA of clone 105-089-3-0-G10-CS. It is overrepresented in fetal brain.

The protein of SEQ ID NO:252 encoded by the cDNA of SEQ ID NO:11 is distributed
primarily in the prostate and salivary gland. The protein of SEQ ID NO:252 is homologous to
sequences described in PCT publication WO9827205-A2 (which describes a protein that was
isolated from a human adult salivary gland cDNA library) and PCT publication WO9839446-A2.
The disclosure of each of the preceding PCT publications is incorporated herein by reference
in its entirety.

The protein of SEQ ID NO:252 is also homologous to a polypeptide described in PCT
publication WO9835229-A1, the disclosure of which is incorporated herein by reference in its
entirety. WO9835229-A1 describes a peptide of 27 amino acid residues of which 23 are identical to
a portion of the protein of SEQ ID NO:252 (amino acids 20-46). This corresponds to 85% identity;
3 of the 4 remaining positions are conserved changes, yielding 96% homology.
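
The percentages quoted above follow directly from the residue counts. As a brief worked check, the sketch below recomputes them from the figures given in the text (27 aligned residues, 23 identical, 3 conservative changes); the helper name is illustrative only.

```python
def percent(matches: int, aligned_length: int) -> float:
    """Percentage of aligned positions that count as matches."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("need 0 <= matches <= aligned_length and aligned_length > 0")
    return 100.0 * matches / aligned_length


ALIGNED = 27        # residues in the peptide of WO9835229-A1
IDENTICAL = 23      # identical positions (amino acids 20-46 of SEQ ID NO:252)
CONSERVATIVE = 3    # conserved changes among the 4 non-identical positions

identity = percent(IDENTICAL, ALIGNED)                   # about 85.2
homology = percent(IDENTICAL + CONSERVATIVE, ALIGNED)    # about 96.3
```

Rounded to whole numbers, these give the 85% identity and 96% homology stated in the text.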

The protein described in WO 9835229 was identified in reflex tears that were collected
from 12 non-contact lens wearing male and female humans. Reflex tears were stimulated by gently
rubbing the nasal mucosa with a cotton wool tipped bud. Two different batches were collected
from two different groups and examined by analytical and preparative 2-dimensional
electrophoresis. After separation in the second dimension and transfer to PVDF membranes,
protein spots (identified with 0.1% (w/v) Coomassie Blue) were loaded into a membrane-compatible
Hewlett-Packard cartridge. Sequencing was conducted with a Model G1005A (Hewlett-Packard,
CA) sequenator. One of the proteins identified migrated at 25 kDa and was revealed to have 5
isoforms of different pI. Two of these, with pI values of 5.0 and 4.4, were N-terminally sequenced
and gave the sequence of the above peptide. The different isoforms indicate that this protein undergoes
post-translational modifications, including sialylation or acylation. The presence of these isoforms
in different degrees could reflect the disease status of the individual. Accordingly, one embodiment
of the present invention relates to the detection or diagnosis of disease by determining the activity
or level of the protein of SEQ ID NO:252 or a polynucleotide encoding the protein of SEQ ID
NO:252 in an individual. For example, detection of the secreted protein of SEQ ID NO:252 in an
individual may be accomplished non-invasively by measuring protein levels in bodily fluids into
which the protein is secreted, such as tears and saliva. Such methods may be employed both in
humans and in animals. It is probable that after the signal peptide is cleaved, the protein of SEQ ID
NO:252 is secreted into bodily fluids including tears and probably saliva.

The protein of SEQ ID NO:252 can also be used for the screening of non-ocular diseases,
by analyzing tears for marker proteins, particularly those indicative of cancer and genetic disease. In
addition, an altered chromatographic profile (e.g. 2D gel) of the isoforms of the protein of SEQ ID
NO:252 may also indicate the disease state of an individual. For example, the levels of marker
proteins in relation to the protein of SEQ ID NO:252 may be determined to evaluate whether the
individual is suffering from a disease. Alternatively, tears may be analyzed for the levels of
different isoforms of the protein of SEQ ID NO:252 to determine whether the pattern of such
isoforms is indicative of disease.

The protein of SEQ ID NO:252 or fragments thereof may also be used as a lubricant or
cleansing agent for the eyes. This protein can be included in contact lens washing and storage
solutions. This protein can also be useful as an ingredient in eye washing solutions (e.g. eye drops)
used for everyday redness or healing after surgical/laser intervention. For example, the protein may
be used to reduce eye inflammation. Alternatively, anti-bacterial properties may be exploited by
including the protein of SEQ ID NO:252 or fragments thereof in solutions, creams or ointments for
the eyes, as well as creams or ointments in general for external applications.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:252,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. In such embodiments, the protein of SEQ ID NO:252, or a
fragment thereof, is administered to an individual in whom it is desired to increase or decrease any
of the activities of the protein of SEQ ID NO:252. The protein of SEQ ID NO:252 or fragment
thereof may be administered directly to the individual or, alternatively, a nucleic acid encoding the
protein of SEQ ID NO:252 or a fragment thereof may be administered to the individual.
Alternatively, an agent which increases the activity of the protein of SEQ ID NO:252 may be
administered to the individual. Such agents may be identified by contacting the protein of SEQ ID
NO:252 or a cell or preparation containing the protein of SEQ ID NO:252 with a test agent and
assaying whether the test agent increases the activity of the protein. For example, the test agent
may be a chemical compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:252 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:252 may be identified by contacting the protein of
SEQ ID NO:252 or a cell or preparation containing the protein of SEQ ID NO:252 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, saliva or tears, or to distinguish between two or more possible sources of a sample on the
basis of the level of the protein of SEQ ID NO:252 in the sample. For example, the protein of SEQ
ID NO:252 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples or differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from saliva or tears or tissues other than saliva or tears to determine whether the test sample is from
saliva or tears. Alternatively, the level of the protein of SEQ ID NO:252 in a test sample may be
measured by determining the level of RNA encoding the protein of SEQ ID NO:252 in the test
sample. RNA levels may be measured using nucleic acid arrays or using techniques such as in situ
hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the art. If
desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic acid
sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in control
cells from saliva or tears or tissues other than saliva or tears to determine whether the test sample is
from saliva or tears.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:252,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:252 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:252 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:252 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:252. In such techniques, the level of the protein of SEQ ID NO:252 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:252 in the ill individual is compared to the level in normal individuals to determine
whether the individual has a level of the protein of SEQ ID NO:252 which is indicative of disease.

Protein of SEQ ID NO:308 (internal designation 187-41-0-0-i21-CS)

The protein of SEQ ID NO:308 is encoded by the cDNA of SEQ ID NO:67. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:308
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 187-41-0-0-i21-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:67 described throughout the present application also pertain
to the human cDNA of clone 187-41-0-0-i21-CS.

The protein of SEQ ID NO:308 is highly homologous to human secreted protein nf87_1
from PCT publication WO 9935252-A2 (the disclosure of which is incorporated herein by reference
in its entirety), to amino acids 26-129 of the human secreted protein SEQ ID NO:441 from PCT
publication WO 9906548-A2 (the disclosure of which is incorporated herein by reference in its
entirety), and to amino acids 26-114 of human secreted protein SEQ ID NO:439 from PCT
publication WO 9906548-A2, the disclosure of which is incorporated herein by reference in its
entirety. Thus, the protein of the invention appears to be a polymorphic variant of nf87_1. Since
most of the proteins with high homology to the sequence of the invention have longer 5' termini, it
is conceivable that the protein of the invention is a truncated/spliced variant of these proteins.

The protein of SEQ ID NO:308 was identified among the cDNAs from a library constructed
from brain. Tissue distribution analysis through a BLAST analysis of databases shows that mRNA
encoding this protein was found primarily in kidney, liver, and cancerous prostate.

The protein of SEQ ID NO:308 has chemical and structural homology to human
interferon-inducible (IFI) protein isoforms p27 (63%), HIFI (50% identity), and to interferon-induced protein
6-16 precursor (IFI-6-16, 36%). Furthermore, the protein of the invention has structural homology
(40% identity) to the human erythropoietin (EPO) primary response gene, EPRG3pt from PCT
publication WO 9906063-A2, the disclosure of which is incorporated herein by reference in its
entirety. Thus, the present invention relates to nucleic acid and amino acid sequences of a novel IFI
protein and to the use of these sequences in the diagnosis, study, prevention and treatment of
disease.

The protein of SEQ ID NO:308 comprises 105 amino acids. From the amino acid
alignments and the hydrophobicity plots, it has a predicted signal peptide sequence spanning
residues 31-43 and two predicted transmembrane domains spanning residues 17-37 and 48-68.
Accordingly, one embodiment of the present invention is a polypeptide comprising the signal
peptide and/or one or more of the transmembrane domains.
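
A hydrophobicity plot of the kind referred to above can be sketched as a sliding-window average over a per-residue hydropathy scale. The present application does not state which scale or window was used; the example below assumes the widely used Kyte-Doolittle scale with a 19-residue window and a threshold of 1.6, which are the conventional settings for flagging candidate membrane-spanning segments. The function names are illustrative, and the sketch is not a substitute for a dedicated signal peptide or transmembrane predictor.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19) -> list:
    """Mean hydropathy of every full window along the sequence (one value per window start)."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(seq)):
        # Slide the window by one residue: add the new score, drop the oldest.
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def hydrophobic_window_starts(sequence: str, window: int = 19, threshold: float = 1.6) -> list:
    """1-based start positions of windows whose mean hydropathy reaches the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]
```

Runs of consecutive start positions returned by `hydrophobic_window_starts` mark stretches that would appear as peaks in the plot and are candidates for transmembrane domains or the hydrophobic core of a signal peptide.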

Interferons (IFNs) are a part of the group of intercellular messenger proteins known as
cytokines. α-IFN is the product of a multigene family of at least 16 members, whereas β-IFN is the
product of a single gene. α- and β-IFNs are also known as type I IFNs. Type I IFNs are produced in
a variety of cell types. Biosynthesis of type I IFNs is stimulated by viruses and other pathogens,
and by various cytokines and growth factors. γ-IFN, also known as type II IFN, is produced in
T-cells and natural killer cells. Antigens to which the organism has been sensitized stimulate
biosynthesis of type II IFN. Both α- and γ-IFNs are immunomodulators and anti-inflammatory
agents, activating macrophages, T-cells and natural killer cells.

IFNs are part of the body's natural defense to viruses and tumors. They exert these defenses
by affecting the function of the immune system and by direct action on pathogens and tumor cells.
IFNs mediate these multiple effects in part by inducing the synthesis of many cellular proteins.
Some interferon-inducible (IFI) genes are induced equally well by α-, β- and γ-IFNs. Other IFI
genes are preferentially induced by the type I or by the type II IFNs. The various proteins produced
by IFI genes possess antitumor, antiviral and immunomodulatory functions. The expression of
tumor antigens in cancer cells is increased by α-IFN, and renders the cancer cells more susceptible
to immune rejection. The IFI proteins synthesized in response to viral infections are known to
inhibit viral functions such as cell penetration, uncoating, RNA and protein synthesis, assembly and
release (Hardman JG et al 25 (1996) The Pharmacological Basis of Therapeutics, McGraw-Hill,
New York NY pp 1211-1214, the disclosure of which is incorporated herein by reference in its
entirety). Type II IFN stimulates expression of major histocompatibility complex (MHC) proteins
and is thus used in immune response enhancement.

The IFI gene known as 6-16 encodes an mRNA which is highly induced by type I IFNs in a
variety of human cells (Kelly JM et al (1986) EMBO J 5:1601-1606, the disclosure of which is
incorporated herein by reference in its entirety). After induction, 6-16 mRNA constitutes as much
as 0.1% of the total cellular mRNA. The 6-16 mRNA is present at only very low levels in the
absence of type I IFN, and is only weakly induced by type II IFN. The 6-16 mRNA encodes a
hydrophobic protein of 130 amino acids. The first 20 to 23 amino acids comprise a putative signal
peptide. Protein 6-16 has at least two predicted transmembrane regions culminating in a negatively
charged C-terminus.

The p27 gene encodes a protein with 41% amino acid sequence identity to the 6-16 protein.
The p27 gene is expressed in some breast tumor cell lines and in a gastric cancer cell line. In other
breast tumor cell lines, in the HeLa cervical cancer cell line, and in fetal lung fibroblasts, p27
expression occurs only upon α-IFN induction. In one breast tumor cell line, p27 is independently
induced by estradiol and by IFN (Rasmussen UB et al (1993) Cancer Res 53:4096-4101, the
disclosure of which is incorporated herein by reference in its entirety). Expression of p27 was
analyzed in 21 primary invasive breast carcinomas, 1 breast cancer bone metastasis, and 3 breast
fibroadenomas. High levels of p27 were found in about one-half of the primary carcinomas and in
the bone metastasis, but not in the fibroadenomas. These observations suggest that certain breast
tumors may produce high levels of, or have increased sensitivity to, type I IFN as compared to other
breast tumors (Rasmussen UB et al, supra). In addition, the p27 gene is expressed at significant levels
in normal tissues including colon, stomach and lung, but is not expressed in placenta, kidney, liver or
skin (Rasmussen UB et al, supra).

The small hydrophobic IFI gene products may contribute to viral resistance. A hepatitis-C
virus (HCV)-induced gene, 130-51, was isolated from a cDNA library prepared from chimpanzee
liver during the acute phase of the infection. The protein product of this gene has 97% identity to
the human 6-16 protein (Kato T et al (1992) Virology 190:856-860, the disclosure of which is
incorporated herein by reference in its entirety). The authors of the preceding paper suggest that
HCV infection actively induces IFN expression, which in turn induces expression of IFI genes
including 130-51. The IFI proteins synthesized in response to viral infections are known to inhibit
viral functions such as penetration, uncoating, RNA or protein synthesis, assembly or release. The
130-51 protein may inhibit one or more of these functions in HCV. A particular virus may be
inhibited in multiple functions by IFI proteins. In addition, the principal inhibitory effect exerted by
IFI proteins differs among the virus families (Hardman JG, supra, p 1211, the disclosure of which is
incorporated herein by reference).

The HIFI protein (PCT publication WO 9812223-A2, the disclosure of which is
incorporated herein by reference in its entirety) is a human sequence identified among cDNAs from
a library constructed from human neonatal kidney. Northern blot analysis using the LIFESEQ™
database (Incyte Pharmaceuticals, Palo Alto, CA) shows that HIFI mRNA was found only in
neonatal kidney. The HIFI protein consists of 104 amino acids and has 55%, 45%, and 46% amino
acid sequence identity to p27, 6-16 and 130-51, respectively.

Based on the chemical and structural homology between the protein of SEQ ID NO:308 and
the small hydrophobic IFI proteins from human and chimpanzee, it is believed that the protein of
SEQ ID NO:308 is synthesized when interferons are produced in infections, inflammation,
autoimmune diseases etc. Interferons are produced in response to various cytokines and growth
factors, in viral infections, inflammation, autoimmune diseases, and cancers. Accordingly, the
protein of SEQ ID NO:308 or fragments thereof may be used in diagnosis and treatment of diseases
such as, but not limited to, autoimmune disorders such as rheumatoid arthritis, Graves disease,
systemic lupus erythematosus, autoimmune hepatitis, Wegener's granulomatosis, sarcoidosis,
polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's syndrome, inflammatory
bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis, Type I diabetes,
insulin-dependent diabetes mellitus, lupus nephritis, and allergic encephalomyelitis; proliferative disorders
including various forms of cancer such as leukemias, lymphomas (Hodgkins and non-Hodgkins),
sarcomas, melanomas, adenomas, carcinomas of solid tissue, hypoxic tumors, squamous cell
carcinomas of the mouth, throat, larynx, and lung, genitourinary cancers such as cervical and
bladder cancer, hematopoietic cancers, head and neck cancers, and nervous system cancers; benign
lesions such as papillomas; atherosclerosis; angiogenesis; and viral infections, in particular HCV and
HIV infections, as well as other pathogen-induced infections (e.g. leishmania).

The protein of SEQ ID NO:308 or fragments thereof may also be used to treat conditions
associated with inflammation or immune impairment (e.g. rheumatoid arthritis, osteoarthritis and
AIDS).

Another embodiment of the present invention relates to the use of the protein of SEQ ID
NO:308 or fragments thereof to treat and/or prevent the ill effects of bacterial infection during
pregnancy in mammals, such as spontaneous abortion and maternal death. In a preferred
embodiment, the protein of the invention may be used to counteract the effects of the bacterial
endotoxin lipopolysaccharide (LPS). Methods for using such compositions are described in
Dziegielewska and Andersen, Biol. Neonate, 74:372-5 (1998), the disclosure of which is
incorporated herein by reference in its entirety.

Furthermore, the protein of SEQ ID NO:308 or fragments thereof are useful as a reagent for
analyzing the control of gene expression by interferons and other cytokines in both normal and
diseased cells. The protein of SEQ ID NO:308 or fragments thereof may be used to identify
specific molecules to which it binds, such as agonists, antagonists or inhibitors.

Another embodiment of the present invention relates to methods of using the protein of
SEQ ID NO:308 or fragments thereof to identify and/or quantify cytokines of the interferon family
as well as other cytokines such as IL-10 and tumor antigens, which may interact with the protein of
the invention.



WO 01/42451 PCT/IB00/01938 

The protein of SEQ ID NO:308 or fragments thereof may also be included in
pharmaceutical preparations for treating cancer or prevention/treatment of other diseases associated
with changes in expression of the protein of the invention. In another embodiment of the present
invention, the protein of SEQ ID NO:308 or fragments thereof is used to inhibit and/or modulate the
effect of cytokines and related molecules such as IL-2, TNF alpha, CTLA4, CD28, and others, by
preventing the binding of the endogenous cytokines to their natural receptors, thereby blocking cell
proliferation or inhibitory signals generated by the ligand-receptor binding event.

The protein of SEQ ID NO:308 or fragments thereof is useful to correct defects in in vivo
models of disease such as autoimmune, inflammation and tumor models, by injecting the protein
intraperitoneally, intravenously, subcutaneously or directly into the diseased tissue.

The DNA encoding the protein of SEQ ID NO:308 or fragments thereof is useful in
diagnostic assays for conditions/diseases associated with expression of the protein of the invention.
The diagnostic assay is useful to distinguish between absence, presence, and excess expression of
the protein of the invention and to monitor regulation of levels of the protein of the invention during
therapeutic intervention. The DNA may also be incorporated into effective eukaryotic expression
vectors and directly targeted to a specific tissue, organ, or cell population for use in gene therapy to
treat the above mentioned conditions, including tumors, and/or to correct disease- or genetic-induced
defects in any of the above mentioned proteins including the protein of the invention. The DNA
may also be used to design antisense sequences and ribozymes, which can be administered to
modify gene expression in tumor and pathogen-infected cells and to influence expression of
cytokines and growth factors. In vivo delivery of genetic constructs into subjects can be developed
to the point of targeting specific cell types, such as tumor cells, in which expression of the protein of the
invention may be affected or may modulate the expression and/or activity of other proteins such as
cytokines, growth factors, their receptors and/or tumor antigens. The DNA is also useful to detect unknown
upstream sequences (e.g. promoters and regulatory elements) by standard techniques and for
research into the control of gene expression by interferons and other cytokines, as well as growth
and transcription factors, in normal and diseased cells. Hybridization probes are useful to detect
DNA encoding the protein of the invention (or closely related molecules) in biological samples, and
for mapping the naturally occurring genomic sequence to a particular chromosome/chromosome
region. The DNA may be used to generate and/or treat in vivo animal models of disease, including
susceptibility or resistance to infection, inflammation, tumors and autoimmune conditions, as well
as tumor therapy, based on vaccine, knock-out and transgene technologies.

Antibodies against the protein of SEQ ID NO:308 or fragments thereof are useful for the
diagnosis of conditions and diseases associated with its expression and to quantify the protein of the
invention (e.g. in assays to monitor patients during therapeutic intervention). Antibodies specific
for the protein may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain
antibodies, and Fab fragments produced by a Fab expression library. Neutralizing antibodies are especially
preferred for diagnostics and therapeutics. Diagnostic assays for the protein of the invention include
methods utilizing the antibody and a label to detect the protein of the invention in human body
fluids or extracts of cells or tissues.

The protein of the invention and its catalytic or immunogenic fragments or oligopeptides
thereof can be used for screening therapeutic compounds in any of a variety of drug screening
techniques, including high-throughput screening. Methods which may be used to quantitate the expression of
the nucleotide or protein of the invention include, but are not limited to, polymerase chain reaction
(PCR), RT-PCR, RNase protection, Northern and Western blotting, enzyme-linked immunosorbent
assay (ELISA), radioimmunoassay (RIA), fluorescence-activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Under conditions of significant blood loss, EPO therapy, or both, iron-restricted
erythropoiesis is evident. However, intravenous or oral iron therapy has substantial drawbacks.
Moreover, traditional biochemical markers of storage iron in patients with anemia of chronic
disease are unhelpful in the assessment of iron status (Lawrence T et al (2000) Blood 96:823-833,
the disclosure of which is incorporated herein by reference in its entirety). As the protein of SEQ
ID NO:308 bears homology to the human erythropoietin (EPO) primary response gene, EPRG3pt, it
may be used to promote red blood cell formation or to monitor the value of safer intravenous iron
preparations in patients with blood loss anemia, particularly those undergoing EPO therapy.

The hydrophobic IFI protein of SEQ ID NO:308 or fragments thereof may be used to
diagnose conditions associated with its induction. For example, the protein of SEQ ID NO:308 or
fragments thereof may be useful in the diagnosis and treatment of tumors, viral infections,
inflammation, or conditions associated with impaired immunity, anemia of chronic blood loss or
chronic disease, hemochromatosis, and EPO therapy. Furthermore, this protein may be used for
investigating the control of gene expression by IFNs and other cytokines, as well as hormones and
growth factors, in normal and diseased cells.

The protein of SEQ ID NO:308 or fragments thereof is useful to correct defects in in vivo
models of disease such as autoimmune, inflammation, anemia, iron-overload and tumor models, by
injecting the protein either intraperitoneally, intravenously, subcutaneously or directly into the
diseased tissue.

In addition, the protein of SEQ ID NO:308 is structurally related to other proteins having
homology and/or structural similarity with human p27 (Rasmussen, U.B., et al., 1993, Cancer
Research 53:4096-4101, the disclosure of which is incorporated herein by reference). Accordingly,
the protein of brain, fetal brain, kidney, fetal kidney, or colon may be used to regulate the
proliferation of EPO-dependent cells or the growth and development of erythroid and other
hematopoietic lineages.

The protein of SEQ ID NO:308 or fragments thereof, or polynucleotides encoding the
protein of SEQ ID NO:308 or fragments thereof, may be used to treat or ameliorate anemia of
chronic disease and chronic renal failure, polycythemia, cancer, AIDS, drug- and
phlebotomy-induced anemias, hemochromatosis, erythropoiesis mediated by EPO therapy, and other conditions
associated with altered activity or levels of the protein of SEQ ID NO:308.

In another embodiment, the present invention relates to methods for identifying agonists
and antagonists/inhibitors using the protein of SEQ ID NO:308 or fragments thereof, and treating
conditions with the identified compounds. In a still further aspect, the invention relates to
diagnostic assays for detecting diseases associated with inappropriate levels or activity of the
protein of SEQ ID NO:308. Still another embodiment of the invention relates to the use of the
protein of SEQ ID NO:308, fragments thereof, or the DNA encoding the protein of SEQ ID NO:308 or
fragments thereof to monitor the value of iron therapy in patients undergoing EPO therapy, or
experiencing blood loss, or both.

The DNA encoding the protein of SEQ ID NO:308 or fragments thereof is useful in
diagnostic assays for conditions/diseases associated with abnormal expression of the protein of SEQ
ID NO:308. The diagnostic assay is useful to distinguish between absence, presence, and excess
expression of the protein of the invention and to monitor regulation of levels of the protein of the
invention during therapeutic intervention. The DNA may also be incorporated into effective
eukaryotic expression vectors and directly targeted to a specific tissue, organ, or cell population for
use in gene therapy to treat the above mentioned conditions, including tumors, and/or to correct
disease- or genetic-induced defects in any of the above mentioned proteins including the protein of
the invention. The DNA may also be used to design antisense sequences and ribozymes, which can
be administered to modify gene expression in tumor and pathogen-infected cells and to influence
expression of cytokines, hormones and growth factors. In vivo delivery of genetic constructs into
subjects can be developed to the point of targeting specific cell types, such as tumor cells, where
expression of the protein of the invention may be affected or may modulate the expression and/or
activity of other proteins such as cytokines, growth factors, their receptors and/or tumor antigens. The DNA
is also useful to detect unknown upstream sequences (e.g. promoters and regulatory elements) by
standard techniques and for research into the control of gene expression by interferons and other
cytokines, as well as growth and transcription factors, in normal and diseased cells. Hybridization
probes are useful to detect DNA encoding the protein of the invention (or closely related molecules)
in biological samples, and for mapping the naturally occurring genomic sequence to a particular
chromosome/chromosome region. The DNA may be used to generate and/or treat in vivo animal
models of disease, including susceptibility or resistance to infection, tumors, autoimmune
conditions, anemia and iron-overload, as well as tumor therapy, based on vaccine, knock-out and
transgene technologies.

Antibodies against the protein of SEQ ID NO:308 are useful for the diagnosis of conditions
and diseases associated with its expression and to quantify the protein of the invention (e.g. in
assays to monitor patients during therapeutic intervention). Antibodies specific for the protein may
include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, and Fab fragments
produced by a Fab expression library. Neutralizing antibodies are especially preferred for
diagnostics and therapeutics. Diagnostic assays for the protein of SEQ ID NO:308 include methods
utilizing the antibody and a label to detect the protein of the invention in human body fluids or
extracts of cells or tissues.

The protein of SEQ ID NO:308 and its catalytic or immunogenic fragments or oligopeptides
thereof can be used for screening therapeutic compounds in any of a variety of drug screening
techniques, including high-throughput screening. Methods which may be used to quantitate the expression of
the nucleotide or protein of the invention include, but are not limited to, polymerase chain reaction
(PCR), RT-PCR, RNase protection, Northern blotting, enzyme-linked immunosorbent assay
(ELISA), radioimmunoassay (RIA), fluorescence-activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:308,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, including breast
cancer, viral infection, bacterial infection, inflammation, autoimmune disorders, rheumatoid
arthritis, Graves' disease, systemic lupus erythematosus, autoimmune hepatitis, Wegener's
granulomatosis, sarcoidosis, polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's
syndrome, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis,
Type I diabetes, insulin-dependent diabetes mellitus, lupus nephritis, and allergic
encephalomyelitis; proliferative disorders including various forms of cancer such as leukemias,
lymphomas (Hodgkin's and non-Hodgkin's), sarcomas, melanomas, adenomas, carcinomas of solid
tissue, hypoxic tumors, squamous cell carcinomas of the mouth, throat, larynx, and lung,
genitourinary cancers such as cervical and bladder cancer, hematopoietic cancers, head and neck
cancers, and nervous system cancers, benign lesions such as papillomas, atherosclerosis,
angiogenesis; viral infections, in particular HCV and HIV infections, as well as other
pathogen-induced infections (e.g. leishmania).
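
The notion of a fragment comprising a given number of consecutive amino acids can be illustrated with a minimal sketch. The function name and the toy sequence below are hypothetical and are not taken from the sequence listing; the sketch only enumerates contiguous windows of a stated length and says nothing about biological activity.

```python
def consecutive_fragments(sequence, length):
    """Return every contiguous fragment of exactly `length` residues.

    `sequence` is a one-letter amino acid string. A fragment "comprising at
    least N consecutive amino acids" contains one of the windows returned
    for length N.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    # range() is empty when the sequence is shorter than the requested length.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Toy sequence for illustration only (not SEQ ID NO:308).
toy = "MKTLLVAG"
five_mers = consecutive_fragments(toy, 5)
too_long = consecutive_fragments(toy, 10)
```

A requested length longer than the sequence itself simply yields no fragments.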

In such embodiments, the protein of SEQ ID NO:308, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of
the protein of SEQ ID NO:308. The protein of SEQ ID NO:308 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:308 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:308 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:308 or a cell or
preparation containing the protein of SEQ ID NO:308 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:308 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:308 may be identified by contacting the protein of
SEQ ID NO:308 or a cell or preparation containing the protein of SEQ ID NO:308 with a test
agent and assaying whether the test agent decreases the activity of the protein. For example, the
agent may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as
an antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, kidney, liver, or cancerous prostate, or to distinguish between two or more possible
sources of a sample on the basis of the level of the protein of SEQ ID NO:308 in the sample. For
example, the protein of SEQ ID NO:308 or fragments thereof may be used to generate antibodies
using any techniques known to those skilled in the art, including those described therein. Such
antibodies may then be used to identify tissues of unknown origin, for example, forensic samples,
differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate different
tissue types in a tissue cross-section using immunochemistry. In such methods a sample is
contacted with the antibody, which may be detectably labeled, under conditions which facilitate
antibody binding. The level of antibody binding to the test sample is measured and compared to the
level of binding to control cells from brain, kidney, liver, or cancerous prostate or tissues other than
brain, kidney, liver, or cancerous prostate to determine whether the test sample is from brain,
kidney, liver, or cancerous prostate. Alternatively, the level of the protein of SEQ ID NO:308 in a
test sample may be measured by determining the level of RNA encoding the protein of SEQ ID
NO:308 in the test sample. RNA levels may be measured using nucleic acid arrays or using
techniques such as in situ hybridization, Northern blots, dot blots or other techniques familiar to
those skilled in the art. If desired, an amplification reaction, such as a PCR reaction, may be
performed on the nucleic acid sample prior to analysis. The level of RNA in the test sample is
compared to RNA levels in control cells from brain, kidney, liver, or cancerous prostate or tissues
other than brain, kidney, liver, or cancerous prostate to determine whether the test sample is from
brain, kidney, liver, or cancerous prostate.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:308,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:308 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:308 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:308 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:308. In such techniques, the level of the protein of SEQ ID NO:308 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:308 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:308 which is associated
with disease.
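
The comparison of a measured level against levels in normal individuals can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify a decision rule, so the cutoff of two sample standard deviations around the mean of the normal group, the function name, and the numeric values are all hypothetical.

```python
from statistics import mean, stdev


def classify_level(test_level, normal_levels, n_sd=2.0):
    """Compare one measured level with a reference group of normal levels.

    Assumption (not from the source text): the reference range is
    mean +/- n_sd sample standard deviations of the normal group.
    """
    if len(normal_levels) < 2:
        raise ValueError("need at least two normal measurements")
    center = mean(normal_levels)
    spread = stdev(normal_levels)
    low, high = center - n_sd * spread, center + n_sd * spread
    if test_level > high:
        return "elevated"
    if test_level < low:
        return "reduced"
    return "within reference range"


# Hypothetical assay readings in arbitrary units.
normal = [10.0, 12.0, 11.0, 9.0, 13.0]
result_high = classify_level(20.0, normal)
result_mid = classify_level(11.0, normal)
result_low = classify_level(5.0, normal)
```

In practice the reference range would be established for the particular assay and population; the two-standard-deviation rule is used here only because it is simple to state.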

Protein of SEQ ID NOs:289 and 307 (internal designations 175-1-3-0-E5-CS.cor and 187-39-0-0-k12-CS)

The protein of SEQ ID NO:289 is encoded by the cDNA of SEQ ID NO:48. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:289
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 175-1-3-0-E5-CS. In addition, it will be appreciated that all characteristics and uses
of the nucleic acid of SEQ ID NO:48 described throughout the present application also pertain to
the human cDNA of clone 175-1-3-0-E5-CS.

The protein of the invention consists of 130 amino acids. From the amino acid alignments
and the hydrophobicity plots, it has a predicted signal peptide sequence spanning residues 8-20 and
four predicted transmembrane domains spanning residues 2-24, 42-61, 70-90 and 99-119.
Accordingly, some embodiments of the present invention relate to polypeptides comprising the
signal peptide and/or one or more of the transmembrane domains.
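
A hydrophobicity plot of the kind referred to above can be sketched with a sliding-window average of Kyte-Doolittle hydropathy values. The text does not state which scale, window, or cutoff was used, so the 19-residue window and the 1.6 threshold (a commonly cited heuristic for membrane-spanning segments) are assumptions, as are the function names and the toy sequence.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return (center_position, mean_hydropathy) per full window (1-based)."""
    seq = seq.upper()
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        score = sum(KD[aa] for aa in seq[start:start + window]) / window
        profile.append((start + half + 1, score))
    return profile


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Merge windows whose mean hydropathy meets the threshold into spans.

    Returns 1-based inclusive (start, end) spans. The threshold is an
    assumed heuristic, not a value taken from the source text.
    """
    seq = seq.upper()
    spans = []
    for start in range(len(seq) - window + 1):
        score = sum(KD[aa] for aa in seq[start:start + window]) / window
        if score >= threshold:
            s, e = start + 1, start + window
            if spans and s <= spans[-1][1] + 1:
                spans[-1] = (spans[-1][0], max(spans[-1][1], e))
            else:
                spans.append((s, e))
    return spans


# Toy sequence: a 20-residue leucine stretch flanked by charged residues.
toy = "D" * 10 + "L" * 20 + "D" * 10
profile = hydropathy_profile(toy)
segments = candidate_tm_segments(toy)
```

Because every qualifying window is merged in full, the reported span overshoots the hydrophobic core by a few residues at each end; dedicated prediction tools refine such boundaries.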

The protein of SEQ ID NO:289 encoded by the cDNA of SEQ ID NO:48 is homologous to
SEQ ID NO:4199 from EP 1033401-A2 (the disclosure of which is incorporated herein by
reference in its entirety), a human secreted protein. Another protein, SEQ ID NO:307, encoded by
the cDNA of SEQ ID NO:66, is a polymorphic variant of the protein of SEQ ID NO:289, and shares
all of the herein-described functions and uses.

The present invention relates to a novel protein identified among the cDNAs from a library
constructed from salivary gland, and to the use of the nucleic acid and amino acid sequences
disclosed herein in the study, diagnosis, prevention, and treatment of disease. Tissue distribution
analysis predicted by BLAST on databases shows that mRNA encoding this protein was found
primarily in brain and fetal brain, with lower amounts in kidney, fetal kidney and colon.

Interferons (IFNs) are a part of the group of intercellular messenger proteins known as
cytokines. α-IFN is the product of a multigene family of at least 16 members, whereas β-IFN is the
product of a single gene. α- and β-IFNs are also known as type I IFNs. Type I IFNs are produced in
a variety of cell types. Biosynthesis of type I IFNs is stimulated by viruses and other pathogens,
and by various cytokines and growth factors. γ-IFN, also known as type II IFN, is produced in
T-cells and natural killer cells. Antigens to which the organism has been sensitized stimulate
biosynthesis of type II IFN. Both α- and γ-IFNs are immunomodulators and anti-inflammatory
agents, activating macrophages, T-cells and natural killer cells.

IFNs are part of the body's natural defense against viruses and tumors. They exert these defenses
by affecting the function of the immune system and by direct action on pathogens and tumor cells.
IFNs mediate these multiple effects in part by inducing the synthesis of many cellular proteins.
Some interferon-inducible (IFI) genes are induced equally well by α-, β- and γ-IFNs. Other IFI
genes are preferentially induced by the type I or by the type II IFNs. The various proteins produced
by IFI genes possess antitumor, antiviral and immunomodulatory functions. The expression of
tumor antigens in cancer cells is increased by α-IFN, which renders the cancer cells more susceptible
to immune rejection. The IFI proteins synthesized in response to viral infections are known to
inhibit viral functions such as cell penetration, uncoating, RNA and protein synthesis, assembly and
release (Hardman JG et al (1996) The Pharmacological Basis of Therapeutics, McGraw-Hill,
New York NY pp 1211-1214, the disclosure of which is incorporated herein by reference in its
entirety). Type II IFN stimulates expression of major histocompatibility complex (MHC) proteins
and is thus used in immune response enhancement.

The protein of SEQ ID NO:289 is a small hydrophobic protein having chemical and
structural homology to human interferon-inducible (IFI) protein isoforms 6-16 (97% identity), HIFI
(44%), and p27 (33%), as well as 130-51, the chimpanzee homolog of 6-16 (97%). Thus, the
protein of SEQ ID NO:289 and the nucleic acid encoding it are polymorphic variants of 6-16 or the
gene encoding 6-16. The protein of SEQ ID NO:289, fragments thereof, or nucleic acids encoding
the protein of SEQ ID NO:289 or fragments thereof may be used in the diagnosis, study, prevention
and treatment of disease as described below.

The IFI gene known as 6-16 encodes an mRNA which is highly induced by type I IFNs in a
variety of human cells (Kelly JM et al (1986) EMBO J 5:1601-1606, the disclosure of which is
incorporated herein by reference in its entirety). After induction, 6-16 mRNA constitutes as much as
0.1% of the total cellular mRNA. The 6-16 mRNA is present at only very low levels in the absence
of type I IFN, and is only weakly induced by type II IFN. The 6-16 mRNA encodes a hydrophobic
protein of 130 amino acids. The first 20 to 23 amino acids comprise a putative signal peptide.
Protein 6-16 has at least two predicted transmembrane regions culminating in a negatively charged
C-terminus.

The p27 gene encodes a protein with 41% amino acid sequence identity to the 6-16 protein.
The p27 gene is expressed in some breast tumor cell lines and in a gastric cancer cell line. In other
breast tumor cell lines, in the HeLa cervical cancer cell line, and in fetal lung fibroblasts, p27
expression occurs only upon α-IFN induction. In one breast tumor cell line, p27 is independently
induced by estradiol and by IFN (Rasmussen UB et al (1993) Cancer Res 53:4096-4101, the
disclosure of which is incorporated herein by reference in its entirety). Expression of p27 was
analyzed in 21 primary invasive breast carcinomas, 1 breast cancer bone metastasis, and 3 breast
fibroadenomas. High levels of p27 were found in about one-half of the primary carcinomas and in
the bone metastasis, but not in the fibroadenomas. These observations suggest that certain breast
tumors may produce high levels of, or have increased sensitivity to, type I IFN as compared to other
breast tumors (Rasmussen UB et al, supra). In addition, the p27 gene is expressed at significant levels
in normal tissues including colon, stomach and lung, but is not expressed in placenta, kidney, liver or
skin (Rasmussen UB et al, supra).

The small hydrophobic IFI gene products may contribute to viral resistance. A hepatitis-C
virus (HCV)-induced gene, 130-51, was isolated from a cDNA library prepared from chimpanzee
liver during the acute phase of the infection. The protein product of this gene has 97% identity to
the human 6-16 protein (Kato T et al (1992) Virology 190:856-860, the disclosure of which is
incorporated herein by reference in its entirety). The authors of this paper suggest that HCV
infection actively induces IFN expression, which in turn induces expression of IFI genes including
130-51. The IFI proteins synthesized in response to viral infections are known to inhibit viral
functions such as penetration, uncoating, RNA or protein synthesis, assembly or release. The
130-51 protein may inhibit one or more of these functions in HCV. A particular virus may be inhibited
in multiple functions by IFI proteins. In addition, the principal inhibitory effect exerted by IFI
proteins differs among the virus families (Hardman JG, supra, p 1211, the disclosure of which is
incorporated herein by reference).

The HIFI protein (PCT publication WO 9812223-A2, the disclosure of which is
incorporated herein by reference in its entirety) is a human sequence identified among cDNAs from
a library constructed from human neonatal kidney. Northern blot analysis using the LIFESEQ™
database (Incyte Pharmaceuticals, Palo Alto, CA) shows that HIFI mRNA was found only in
neonatal kidney. The HIFI protein consists of 104 amino acids and has 55%, 45%, and 46% amino
acid sequence identity to p27, 6-16 and 130-51, respectively.
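
Percent amino acid sequence identity figures such as those quoted above are computed from a pairwise alignment. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps and that identity is expressed over the aligned, non-gap positions; other denominators (for example, the length of the shorter sequence) are also in use, and the text does not say which convention applies. The toy alignment is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment: 5 ungapped columns, 4 of them identical.
identity = percent_identity("MKV-LA", "MRVQLA")
```

Producing the alignment itself (for example with BLAST or a dynamic-programming aligner) is a separate step not shown here.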

The hydrophobic IFI proteins of the invention may provide the basis for clinical diagnosis
of diseases associated with their induction. These proteins may be useful in the diagnosis and
treatment of tumors, viral infections, inflammation, or conditions associated with impaired
immunity. Furthermore, these proteins may be used for investigations of the control of gene
expression by IFNs and other cytokines in normal and diseased cells.

Based on the chemical and structural homology among the protein of SEQ ID NO:289 and
the small hydrophobic IFI proteins from human and chimpanzee, it is believed that the protein of
SEQ ID NO:289 is synthesized when interferons are produced in infections, inflammation,
autoimmune diseases, etc. Interferons are produced in response to various cytokines and growth
factors, in viral infections, inflammation, autoimmune diseases, and cancers. Accordingly, the
protein of SEQ ID NO:289 or fragments thereof may be used in diagnosis and treatment of diseases
such as, but not limited to, autoimmune disorders such as rheumatoid arthritis, Graves' disease,
systemic lupus erythematosus, autoimmune hepatitis, Wegener's granulomatosis, sarcoidosis,
polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's syndrome, inflammatory
bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis, Type I diabetes,
insulin-dependent diabetes mellitus, lupus nephritis, and allergic encephalomyelitis; proliferative disorders
including various forms of cancer such as leukemias, lymphomas (Hodgkin's and non-Hodgkin's),
sarcomas, melanomas, adenomas, carcinomas of solid tissue, hypoxic tumors, squamous cell
carcinomas of the mouth, throat, larynx, and lung, genitourinary cancers such as cervical and
bladder cancer, hematopoietic cancers, head and neck cancers, and nervous system cancers, benign
lesions such as papillomas, atherosclerosis, angiogenesis; viral infections, in particular HCV and
HIV infections, as well as other pathogen-induced infections (e.g. leishmania).

The protein of SEQ ID NO:289 or fragments thereof may also be used to treat conditions
associated with inflammation or immune impairment (e.g. rheumatoid arthritis, osteoarthritis and
AIDS).

Another embodiment of the present invention relates to the use of the protein of SEQ ID
NO:289 or fragments thereof to treat and/or prevent the ill effects of bacterial infection during
pregnancy in mammals, such as spontaneous abortion and maternal death. In a preferred
embodiment, the protein of the invention may be used to counteract the effects of the bacterial
endotoxin lipopolysaccharide (LPS). The methods for using such compositions are described in
Dziegielewska and Andersen, Biol. Neonate, 74:372-5 (1998), the disclosure of which is
incorporated herein by reference in its entirety.

Furthermore, the protein of SEQ ID NO:289 or fragments thereof are useful as a reagent for
analyzing the control of gene expression by interferons and other cytokines in both normal and
diseased cells. The protein of SEQ ID NO:289 or fragments thereof may be used to identify
specific molecules with which it binds such as agonists, antagonists or inhibitors.

Another embodiment of the present invention relates to methods of using the protein of 
SEQ ID NO:289 or fragments thereof to identify and/or quantify cytokines of the interferon family 
as well as other cytokines such as IL-10 and tumor antigens, which may interact with the protein of 
the invention. 

The protein of SEQ ID NO:289 or fragments thereof may also be included in
pharmaceutical preparations for treating cancer or for prevention/treatment of other diseases associated
with changes in expression of the protein of the invention. In another embodiment of the present
invention, the protein of SEQ ID NO:289 or fragments thereof is used to inhibit and/or modulate the
effect of cytokines and related molecules such as IL-2, TNF alpha, CTLA4, CD28, and others, by
preventing the binding of the endogenous cytokines to their natural receptors, thereby blocking cell
proliferation or inhibitory signals generated by the ligand-receptor binding event.

The protein of SEQ ID NO:289 or fragments thereof is useful to correct defects in in vivo
models of disease such as autoimmune, inflammation and tumor models, by injecting the protein
either intraperitoneally, intravenously, subcutaneously or directly into the diseased tissue.

The DNA encoding the protein of SEQ ID NO:289 or fragments thereof is useful in
diagnostic assays for conditions/diseases associated with expression of the protein of the invention.
The diagnostic assay is useful to distinguish between absence, presence, and excess expression of
the protein of the invention and to monitor regulation of levels of the protein of the invention during
therapeutic intervention. The DNA may also be incorporated into effective eukaryotic expression
vectors and directly targeted to a specific tissue, organ, or cell population for use in gene therapy to
treat the above mentioned conditions, including tumors, and/or to correct disease- or genetic-induced
defects in any of the above mentioned proteins including the protein of the invention. The DNA
may also be used to design antisense sequences and ribozymes, which can be administered to
modify gene expression in tumor and pathogen-infected cells and to influence expression of
cytokines and growth factors. In vivo delivery of genetic constructs into subjects can be developed
to the point of targeting specific cell types, such as tumor cells, where expression of the protein of the
invention may be affected or may modulate the expression and/or activity of other proteins such as
cytokines, growth factors, their receptors and/or tumor antigens. The DNA is also useful to detect unknown
upstream sequences (e.g. promoters and regulatory elements) by standard techniques and for
research into the control of gene expression by interferons and other cytokines, as well as growth
and transcription factors, in normal and diseased cells. Hybridization probes are useful to detect
DNA encoding the protein of the invention (or closely related molecules) in biological samples, and
for mapping the naturally occurring genomic sequence to a particular chromosome/chromosome
region. The DNA may be used to generate and/or treat in vivo animal models of disease, including
susceptibility or resistance to infection, inflammation, tumors and autoimmune conditions, as well
as tumor therapy, based on vaccine, knock-out and transgene technologies.

Antibodies against the protein of SEQ ID NO:289 or fragments thereof are useful for the
diagnosis of conditions and diseases associated with its expression and to quantify the protein of the
invention (e.g. in assays to monitor patients during therapeutic intervention). Antibodies specific
for the protein may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain
antibodies, and Fab fragments produced by a Fab expression library. Neutralizing antibodies are especially
preferred for diagnostics and therapeutics. Diagnostic assays for the protein of the invention include
methods utilizing the antibody and a label to detect the protein of the invention in human body
fluids or extracts of cells or tissues.

The protein of the invention and its catalytic or immunogenic fragments or oligopeptides
thereof can be used for screening therapeutic compounds in any of a variety of drug screening
techniques, including high-throughput techniques. Methods which may be used to quantitate the expression of
the nucleotide or protein of the invention include, but are not limited to, polymerase chain reaction
(PCR), RT-PCR, RNase protection, Northern and western blotting, enzyme-linked immunosorbent
assay (ELISA), radioimmunoassay (RIA), fluorescence-activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:289,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, including breast
cancer, viral infection, bacterial infection, inflammation, autoimmune disorders, rheumatoid
arthritis, Graves' disease, systemic lupus erythematosus, autoimmune hepatitis, Wegener's
granulomatosis, sarcoidosis, polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's
syndrome, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis,
Type I diabetes, insulin-dependent diabetes mellitus, lupus nephritis, and allergic
encephalomyelitis; proliferative disorders including various forms of cancer such as leukemias,
lymphomas (Hodgkin's and non-Hodgkin's), sarcomas, melanomas, adenomas, carcinomas of solid
tissue, hypoxic tumors, squamous cell carcinomas of the mouth, throat, larynx, and lung,
genitourinary cancers such as cervical and bladder cancer, hematopoietic cancers, head and neck
cancers, and nervous system cancers; benign lesions such as papillomas; atherosclerosis;
angiogenesis; viral infections, in particular HCV and HIV infections, as well as other
pathogen-induced infections (e.g. leishmania).
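
By way of illustration only, the fragment lengths recited above can be enumerated with a short script. The following is a minimal sketch in Python; the function names and the 12-residue placeholder string are hypothetical and are not taken from SEQ ID NO:289.

```python
from typing import Dict, Iterator, Tuple

# Minimum fragment lengths recited in the text.
MIN_LENGTHS = (5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200)


def fragments(sequence: str, length: int) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, fragment) for every run of `length` consecutive residues."""
    for start in range(len(sequence) - length + 1):
        yield start + 1, sequence[start:start + length]


def fragment_counts(sequence: str) -> Dict[int, int]:
    """Count the contiguous fragments available at each recited length."""
    return {n: max(len(sequence) - n + 1, 0) for n in MIN_LENGTHS}


# Hypothetical placeholder; NOT the sequence of SEQ ID NO:289.
toy_sequence = "MKTAYIAKQRQI"

for start, fragment in fragments(toy_sequence, 10):
    print(start, fragment)
print(fragment_counts(toy_sequence)[5])
```

A sequence shorter than a given length simply yields no fragments of that length, so the same helper can be applied to any of the recited lengths.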

In such embodiments, the protein of SEQ ID NO:289, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of 
the protein of SEQ ID NO:289. The protein of SEQ ID NO:289 or fragment thereof may be 
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ 
ID NO:289 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:289 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:289 or a cell or 
preparation containing the protein of SEQ ID NO:289 with a test agent and assaying whether the 
test agent increases the activity of the protein. For example, the test agent may be a chemical 
compound or a polypeptide or peptide. 
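
The screening step described above (contacting the protein with a test agent and assaying whether its activity changes) can be sketched as a comparison against an untreated baseline. The sketch below is an assumption-laden illustration: the function name, the 20% tolerance band and the example readings are hypothetical, and the text does not prescribe any particular threshold. The same comparison also flags agents that decrease activity.

```python
from typing import Dict


def classify_agents(baseline: float, readings: Dict[str, float],
                    tolerance: float = 0.2) -> Dict[str, str]:
    """Label each test agent by its effect on measured activity relative to baseline.

    `tolerance` is a hypothetical cut-off: changes smaller than this fraction
    of the baseline are treated as no clear effect.
    """
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    labels = {}
    for agent, activity in readings.items():
        ratio = activity / baseline
        if ratio >= 1 + tolerance:
            labels[agent] = "increases activity"
        elif ratio <= 1 - tolerance:
            labels[agent] = "decreases activity"
        else:
            labels[agent] = "no clear effect"
    return labels


# Hypothetical activity readings in arbitrary units.
print(classify_agents(100.0, {"agent A": 150.0, "agent B": 60.0, "agent C": 105.0}))
```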

Alternatively, the activity of the protein of SEQ ID NO:289 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere 
with the activity of the protein of SEQ ID NO:289 may be identified by contacting the protein of 
SEQ ID NO:289 or a cell or preparation containing the protein of SEQ ID NO:289 with a test agent 
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for 
example, brain, fetal brain, kidney, fetal kidney, or colon, or to distinguish between two or more 
possible sources of a sample on the basis of the level of the protein of SEQ ID NO:289 in the 
sample. For example, the protein of SEQ ID NO:289 or fragments thereof may be used to generate
antibodies using any techniques known to those skilled in the art, including those described herein.
Such antibodies may then be used to identify tissues of unknown origin, for example, forensic 
samples, differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate 
different tissue types in a tissue cross-section using immunochemistry. In such methods a sample is
contacted with the antibody, which may be detectably labeled, under conditions which facilitate
antibody binding. The level of antibody binding to the test sample is measured and compared to the 
level of binding to control cells from brain, fetal brain, kidney, fetal kidney, or colon or tissues other 
than brain, fetal brain, kidney, fetal kidney, or colon to determine whether the test sample is from 
brain, fetal brain, kidney, fetal kidney, or colon. Alternatively, the level of the protein of SEQ ID
NO:289 in a test sample may be measured by determining the level of RNA encoding the protein of
SEQ ID NO:289 in the test sample. RNA levels may be measured using nucleic acid arrays or
using techniques such as in situ hybridization, Northern blots, dot blots or other techniques familiar
to those skilled in the art. If desired, an amplification reaction, such as a PCR reaction, may be
performed on the nucleic acid sample prior to analysis. The level of RNA in the test sample is
compared to RNA levels in control cells from brain, fetal brain, kidney, fetal kidney, or colon or
tissues other than brain, fetal brain, kidney, fetal kidney, or colon to determine whether the test 
sample is from brain, fetal brain, kidney, fetal kidney, or colon. 
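
The comparison of a test sample against control tissues described above can be illustrated as a nearest-control lookup. This is only a sketch under stated assumptions: the function name, the use of the mean control level as the comparison point, and all numeric values are hypothetical and are not measurements from the present application.

```python
from statistics import mean
from typing import Dict, List


def closest_source(test_level: float, controls: Dict[str, List[float]]) -> str:
    """Return the control tissue whose mean level is nearest to the test level."""
    if not controls:
        raise ValueError("at least one control tissue is required")
    return min(controls, key=lambda tissue: abs(mean(controls[tissue]) - test_level))


# Hypothetical protein or RNA levels in arbitrary units.
control_levels = {
    "brain": [9.0, 11.0],
    "kidney": [4.0, 6.0],
    "other tissue": [0.5, 1.5],
}
print(closest_source(9.5, control_levels))
```

In practice the choice of controls and of the comparison statistic would depend on the assay used; the sketch only shows the comparison step itself.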

In another embodiment, antibodies to the protein of the invention or part thereof may be 
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:289,
using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:289 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:289 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The 
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:289 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:289. In such techniques, the level of the protein of SEQ ID NO:289 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:289 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:289 which is associated
with disease.

Protein of SEQ ID NO:268 (internal designation 116-111-4-0-B3-CS)

The protein of SEQ ID NO:268 is encoded by the cDNA of SEQ ID NO:27. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:268 
described throughout the present application also pertain to the polypeptide encoded by the human 
cDNA of clone 116-111-4-0-B3-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:27 described throughout the present application also pertain
to the human cDNA of clone 116-111-4-0-B3-CS. The protein of the invention is found to be
expressed in testis and lungs. 

The protein of SEQ ID NO:268 encoded by the extended cDNA of SEQ ID NO:27 is a
splicing variant of XAGE-1, a member of the CT antigen family overexpressed in Ewing sarcoma
(Liu, X. F., L. J. Helman, et al. (2000) Cancer Res 60(17): 4752-5, the disclosure of which is
incorporated herein by reference in its entirety). In addition, the protein of SEQ ID NO:268
shows strong homology at the COOH end with PAGE4, another member of the CT antigen family
(Brinkmann, U., G. Vasmatzis, et al. (1999) Cancer Res 59(7): 1445-8, the disclosure of which is
incorporated herein by reference in its entirety).

The cDNA of SEQ ID NO:27 is composed of 5 exons. Exon 1 lies between nucleotides 1-245,
exon 2 lies between nucleotides 246-370, exon 3 lies between nucleotides 371-512, exon 4 lies
between nucleotides 513-639, and exon 5 lies between nucleotides 640-762. Exons 2 to 5 of the cDNA of
SEQ ID NO:27 are shared in part with XAGE-1. However, since the initiation codon of SEQ ID
NO:27 is located in intron 1 of XAGE-1, there is a frameshift in the alignment of the 2 molecules.
Exon 1 of SEQ ID NO:27 lies between nucleotides 110-234 of XAGE-1, exon 2 of SEQ ID NO:27
lies between nucleotides 235-376 of XAGE-1, exon 3 of SEQ ID NO:27 lies between nucleotides
377-503 of XAGE-1, and exon 4 of SEQ ID NO:27 lies between nucleotides 504-526 of XAGE-1.
XAGE-1 is overexpressed in sarcoma and alveolar rhabdomyosarcoma and is also highly
expressed in normal testis (Liu, X. F., L. J. Helman, et al. (2000) Cancer Res 60(17): 4752-5, the
disclosure of which is incorporated herein by reference in its entirety). In addition, XAGE-1 shares
homology with PAGE-4 (Brinkmann, U., G. Vasmatzis, et al. (1999) Cancer Res 59(7): 1445-8, the
disclosure of which is incorporated herein by reference in its entirety) at the COOH end.
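
The exon coordinates recited above lend themselves to a simple arithmetic check. The sketch below assumes the coordinates are 1-based and inclusive (the text does not say so explicitly); the variable and function names are hypothetical. It computes the length of each interval and confirms that both sets of intervals are contiguous.

```python
# Nucleotide coordinates as recited in the text, assumed 1-based and inclusive.
EXONS_SEQ_ID_27 = {1: (1, 245), 2: (246, 370), 3: (371, 512), 4: (513, 639), 5: (640, 762)}
XAGE1_INTERVALS = [(110, 234), (235, 376), (377, 503), (504, 526)]


def span_length(interval):
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def is_contiguous(intervals):
    """True if each interval starts immediately after the previous one ends."""
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(intervals, intervals[1:]))


exon_lengths = {n: span_length(iv) for n, iv in EXONS_SEQ_ID_27.items()}
xage1_lengths = [span_length(iv) for iv in XAGE1_INTERVALS]

print(exon_lengths)   # {1: 245, 2: 125, 3: 142, 4: 127, 5: 123}
print(xage1_lengths)  # [125, 142, 127, 23]
print(is_contiguous(list(EXONS_SEQ_ID_27.values())), is_contiguous(XAGE1_INTERVALS))  # True True
```

Under this assumption the five exons sum to 762 nucleotides, and the first three XAGE-1 intervals (125, 142 and 127 nucleotides) have the same lengths as exons 2, 3 and 4 of SEQ ID NO:27.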

CT antigens are a distinct class of differentiation antigens that are expressed by cancers
arising in nonessential normal tissues such as prostate, breast, and ovary (G. Vasmatzis et al., Proc.
Natl. Acad. Sci. USA, 95: 300-304, 1998, the disclosure of which is incorporated herein by
reference in its entirety) and that have a restricted pattern of expression in normal tissues. This
class of antigens is presented on the surface of tumor cells and is recognized by cytolytic T cells,
leading to lysis. To the extent that these antigens have been studied, it has been via in vitro cytolytic T cell
characterization studies, i.e., the study of the identification of the antigen by a particular
cytolytic T cell ("CTL" hereafter) subset. The subset proliferates upon recognition of the presented
tumor rejection antigen, and the cells presenting the antigen are lysed. Characterization studies have
identified CTL clones which specifically lyse cells expressing the antigens. Examples of this work
may be found in Levy et al., Adv. Cancer Res. 24: 1-59 (1977); Boon et al., J. Exp. Med. 152:
1184-1193 (1980); Brunner et al., J. Immunol. 124: 1627-1634 (1980); Maryanski et al., Eur. J.
Immunol. 124: 1627-1634 (1980); Maryanski et al., Eur. J. Immunol. 12: 406-412 (1982); Palladino
et al., Cancer Res. 47: 5074-5079 (1987), the disclosures of which are incorporated herein by
reference in their entireties.

Some thoroughly studied CT antigens are MAGE, BAGE, GAGE and LAGE; others have since
been added, including PAGE and XAGE, most of them located on chromosome X. Brinkmann et al.
reported the identification of three new members of the GAGE/PAGE family, termed XAGEs.
XAGE-1 and XAGE-2 are expressed in Ewing's sarcoma, rhabdomyosarcoma, a breast cancer, and
a germ cell tumor.

It is believed that the protein of SEQ ID NO:268 is a splicing variant of XAGE-1, a CT 
antigen overexpressed in Ewing sarcoma. 

Accordingly, the present invention includes the use of the protein of SEQ ID NO:268,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition, such as those listed above, associated with over or under expression of the
protein of SEQ ID NO:268. In such embodiments, the protein of SEQ ID NO:268, or a fragment
thereof, is administered to an individual in whom it is desired to increase or decrease any of the
activities of the protein of SEQ ID NO:268. The protein of SEQ ID NO:268 or fragment thereof may
be administered directly to the individual or, alternatively, a nucleic acid encoding the protein of 
SEQ ID NO:268 or a fragment thereof may be administered to the individual. Alternatively, an 
agent which increases the activity of the protein of SEQ ID NO:268 may be administered to the 
individual. Such agents may be identified by contacting the protein of SEQ ID NO:268 or a cell or
preparation containing the protein of SEQ ID NO:268 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical 
compound or a polypeptide or peptide. 

Alternatively, the activity of the protein of SEQ ID NO:268 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:268 may be identified by contacting the protein of
SEQ ID NO:268 or a cell or preparation containing the protein of SEQ ID NO:268 with a test agent 
and assaying whether the test agent decreases the activity of the protein. For example, the agent 
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an 
antisense nucleic acid or a triple helix-forming nucleic acid. 

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably testis and
lungs, or to distinguish between two or more possible sources of a tissue sample on the basis of the
level of the protein of SEQ ID NO:268 in the sample. For example, the protein of SEQ ID NO:268
or fragments thereof may be used to generate antibodies using any techniques known to those
skilled in the art, including those described herein. Such tissue-specific antibodies may then be
used to identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue
that has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a tissue sample is contacted with the
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The
level of antibody binding to the test sample is measured and compared to the level of binding to
control cells from testis or lungs or tissues other than testis or lungs to determine whether the test
sample is from testis or lungs. Alternatively, the level of the protein of SEQ ID NO:268 in a test
sample may be measured by determining the level of RNA encoding the protein of SEQ ID NO:268
in the test sample. RNA levels may be measured using nucleic acid arrays or using techniques such
as in situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the
art. If desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic
acid sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in
control cells from testis or lungs or tissues other than testis or lungs to determine whether the test
sample is from testis or lungs.

In another embodiment, antibodies to the protein of the invention or part thereof may be 
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:268,
including Ewing sarcoma cells, rhabdomyosarcoma cells, breast cancer cells and germ cell tumor
cells, using methods known to those skilled in the art. For example, an antibody against the protein
of SEQ ID NO:268 or a fragment thereof may be fixed to a solid support, such as a chromatography
matrix. A preparation containing cells expressing the protein of SEQ ID NO:268 is placed in
contact with the antibody under conditions which facilitate binding to the antibody. The support is
washed and then the cells are released from the support by contacting the support with agents which
cause the cells to dissociate from the antibody. 

In another embodiment of the present invention, the protein of SEQ ID NO:268 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:268. In some embodiments, the protein of SEQ ID NO:268 or fragments
thereof may be used to diagnose Ewing sarcoma, rhabdomyosarcoma, breast cancer or germ cell
tumors. In such techniques, the level of the protein of SEQ ID NO:268 in an ill individual is
measured using techniques such as those described herein. The level of the protein of SEQ ID
NO:268 in the ill individual is compared to the level in normal individuals. An elevated level or
decreased level of the protein of SEQ ID NO:268 relative to normal individuals suggests that the ill
individual is suffering from a defect in intercellular communication or secretion.

Another embodiment of the invention relates to compositions and methods using the protein
of SEQ ID NO:268 or a fragment thereof as possible targets for vaccine-based therapies of cancer,
including Ewing sarcoma, rhabdomyosarcoma, breast cancer or germ cell tumors. In such
embodiments, an antibody against the protein of SEQ ID NO:268 or a fragment thereof is
administered to an individual suffering from cancer in an amount sufficient to ameliorate or
eliminate the cancer.

Protein of SEQ ID NO:399 (internal designation 160-40-1-0-H4-CS)

The protein of SEQ ID NO:399 is encoded by the cDNA of SEQ ID NO:158. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:399
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 160-40-1-0-H4-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:158 described throughout the present application also
pertain to the human cDNA of clone 160-40-1-0-H4-CS. The protein of the invention is found to be
expressed in testis and lungs. It is overrepresented in fetal brain.

The protein of SEQ ID NO:399 encoded by the cDNA of SEQ ID NO:158 is homologous to
proteins of the Phosphatidic Acid Phosphatase type 2 (PAP2) superfamily (Stukey J. and Carman
G.M., Protein Sci 1997;6:469-472, the disclosure of which is incorporated herein by reference in its
entirety). Three variants of human PAP, i.e. PAP-alpha 2 (W79285) and its alternatively spliced
form PAP-alpha 1 (W79284), PAP-beta (W79286) and PAP-gamma (W79287), have been
identified. The protein of SEQ ID NO:399 displays a pfam characteristic domain of the PAP2
superfamily from positions 19 to 175. Accordingly, one embodiment of the present invention is a
polypeptide comprising amino acid residues 19 to 175 of SEQ ID NO:399. Four membrane
spanning domains are predicted from amino acid positions 17 to, 47 to 67, 108 to 128, and 141 to
161. Accordingly, another embodiment of the present invention is a polypeptide comprising one or
more of the foregoing membrane spanning domains.
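
The residue ranges recited above can be turned into sub-sequences with a small helper. The following is a minimal sketch assuming 1-based, inclusive residue numbering; the helper name and the placeholder string of 180 "X" characters are hypothetical and stand in for SEQ ID NO:399, whose length of 180 amino acids is stated elsewhere in this description. Only the three membrane spanning segments for which both end points are given in the text are listed.

```python
PROTEIN_LENGTH = 180   # length stated elsewhere in this description
PAP2_DOMAIN = (19, 175)
# The first predicted segment is omitted because its end position is not given in the text.
TM_SEGMENTS = [(47, 67), (108, 128), (141, 161)]


def subsequence(sequence: str, start: int, end: int) -> str:
    """Return residues start..end, using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


# Hypothetical placeholder; NOT the sequence of SEQ ID NO:399.
placeholder = "X" * PROTEIN_LENGTH

domain = subsequence(placeholder, *PAP2_DOMAIN)
segments = [subsequence(placeholder, s, e) for s, e in TM_SEGMENTS]

print(len(domain))                  # 157
print([len(s) for s in segments])   # [21, 21, 21]
```

Each of the three listed segments falls within the 19 to 175 domain range, which the bounds check in the helper makes easy to confirm for any other range of interest.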

Phosphatidic acid phosphatase (PAP) (also referred to as phosphatidate phosphohydrolase)
is known to be an important enzyme for glycerolipid biosynthesis. In particular, PAP catalyzes the
conversion of phosphatidic acid (PA) into diacylglycerol (DAG). PA and DAG are lipids involved
in signal transduction and in structural membrane-lipid biosynthesis in cells; thus they represent an
important regulatory point in eukaryotic phospholipid metabolism. DAG is a well-studied lipid
second messenger which is essential for the activation of protein kinase C (Kent; Annu. Rev.
Biochem. 64:315-343; 1995), whereas PA itself is also a lipid messenger implicated in various
signaling pathways such as NADPH oxidase activation and calcium mobilization (English; Cell
Signal. 8:341-347; 1996, the disclosure of which is incorporated herein by reference in its entirety).
The regulation of PAP activity can therefore affect the balance of divergent signaling processes that
the cell receives in terms of PA and DAG (Brindley et al.; Chem. Phys. Lipids 80:45-57; 1996, the
disclosure of which is incorporated herein by reference in its entirety).

PAP exists in at least two isoforms, one of which (PAP1) is presumed to be cytosolic and
membrane associated and the other (PAP2) to be an integral membrane protein (Leung D.W.,
Tompkins C.K., White T.; DNA Cell Biol. 17:377-385 (1998)). The protein of the invention has
180 amino acids and four predicted membrane-spanning segments, and so is presumed to be an integral
membrane protein.

The protein of SEQ ID NO:399 is encoded by a cDNA that has homology to several
alternatively spliced forms of PAP2 genes. For example, the protein of SEQ ID NO:399 has 29%
homology with human phosphatidic acid phosphohydrolase type-2C protein. The protein of SEQ ID
NO:399 also has 40% homology with human phosphatidic acid phosphatase 2B protein. In
addition, the protein of SEQ ID NO:399 has 33% homology with human type 2 phosphatidic acid
phosphatase alpha-2 protein. PAP2-alpha2 and PAP2-alpha1 are two isoforms presumed
to be alternative splice variants from a single gene.
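
The text does not state how the homology percentages above were calculated. Purely as an illustration, one common convention is percent identity over the aligned, ungapped columns of a pairwise alignment; the sketch below assumes that convention, and the function name and the toy aligned strings are hypothetical rather than taken from any sequence of the invention.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences of equal length.
print(percent_identity("AAAA", "AAAT"))  # 75.0
```

Other conventions (for example, counting conservative substitutions or dividing by the length of the shorter sequence) would give different values, so any comparison with the recited percentages would need to use the same method as the original analysis.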

Northern analysis has shown that PAP2-alpha mRNA expression was suppressed in several
tumor tissues, indicating that PAP-2 may act as a tumor suppressor. The relationship between PAP and
tumor suppression is further evidenced by findings that PAP activity is lower in fibroblast cell lines
transformed with either the ras or fps oncogene than in the parental rat1 cell line (Brindley et al.;
Chem. Phys. Lipids 80:45-57; 1996, the disclosure of which is incorporated herein by reference in
its entirety). As discussed above, a decrease in PAP activity in transformed cells correlates with a
concomitant increase in PA concentration. Moreover, elevated PAP activity and lower levels of PA
have been observed in contact-inhibited fibroblasts relative to proliferating and transformed
fibroblasts (Brindley et al.; Chem. Phys. Lipids 80:45-57; 1996, the disclosure of which is
incorporated herein by reference in its entirety). Therefore, the protein of SEQ ID NO:399 or
fragments thereof may be used to decrease cell division and as such can provide a useful tool in
treating cancer. Subsequent analysis of colon tumor tissue derived from four donors confirmed
lower expression of PAP2-alpha than in matching normal colon tissue. Considering these data and
previous demonstrations that certain transformed cell lines have lower PAP activity, human PAP
cDNAs may be used for gene therapy for certain tumors (Leung D.W., Tompkins C.K., White
T.; DNA Cell Biol. 17:377-385 (1998), the disclosure of which is incorporated herein by reference
in its entirety). Accordingly, one embodiment of the present invention is the use of the protein of
SEQ ID NO:399 or a fragment thereof as a tumor suppressor. For example, a nucleic acid
expressing the protein of SEQ ID NO:399 or a fragment thereof may be introduced into an
individual suffering from cancer in order to ameliorate or eliminate the cancer. In fact, nucleic
acids encoding human phosphatidic acid phosphatases have been used to regulate levels of lipid
cellular mediators and in gene therapy of e.g. cancer (PCT publication WO98/46730, the disclosure
of which is incorporated herein by reference in its entirety).

In another embodiment of the present invention, the protein of SEQ ID NO:399 or a
fragment thereof can be used to control the balance of lipid mediators of cellular activation and
signal transduction. The protein of the invention has 33% homology with human phosphatidic acid
phosphatase 2A protein. PAP2A is an integral membrane glycoprotein at the cell surface that plays
an active role in the hydrolysis and uptake of lipids from the extracellular space (Roberts RZ,
Morris AJ; Biochim Biophys Acta 2000 Aug 24;1487(1):33-49, the disclosure of which is
incorporated herein by reference in its entirety). Accordingly, the level or activity of the protein of
SEQ ID NO:399 may be modulated to influence the rate or extent of hydrolysis and uptake of lipids
from the extracellular space using methods such as those described herein.

In another embodiment of the present invention, the protein of SEQ ID NO:399 can be used 
to counterbalance the inflammatory response. PA has been implicated in cytokine-induced
inflammatory responses (Bursten et al.; Circ. Shock 44:14-29, 1994; Abraham et al.; J. Exp. Med.
181:569-575, 1995; Rice et al.; PNAS 91:3857-3861, 1994; Leung et al.; PNAS 92:4813-4817,
1995, the disclosures of which are incorporated herein by reference in their entireties) and the
modulation of numerous protein kinases involved in signal transduction (English et al.; Chem. Phys.
Lipids 80:117-132, 1996, the disclosure of which is incorporated herein by reference in its
entirety). In addition, a nucleic acid encoding the protein of SEQ ID NO:399 or a fragment thereof
may be used to counterbalance the inflammatory response from cytokine stimulation through
degradation of excess PA in cells or to treat or ameliorate inflammatory diseases.

The gene encoding the protein of SEQ ID NO:399 or a fragment thereof can also be used in 
gene therapy for the treatment of obesity associated with diabetes. PAP activity is decreased in the
livers and hearts of the grossly obese and insulin resistant JCR:LA corpulent rat compared to the
control lean phenotype (Brindley et al.; Chem. Phys. Lipids 80:45-57; 1996, the disclosure of
which is incorporated herein by reference in its entirety). The protein of the invention therefore can 
provide an important tool for the treatment of obesity associated with diabetes. 

Accordingly, the present invention includes the use of the protein of SEQ ID NO:399,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition, such as those listed above, in an individual. In such embodiments, the
protein of SEQ ID NO:399, or a fragment thereof, is administered to an individual in whom it is
desired to increase or decrease any of the activities of the protein of SEQ ID NO:399, including
glycerolipid biosynthesis, conversion of phosphatidic acid into diacylglycerol, signal transduction,
membrane-lipid biosynthesis, activation of protein kinase C, NADPH oxidase activation, calcium
mobilization, cell division, production of diacylglycerol, monoacylglycerol, ceramide or
sphingosine, modulation of the inflammatory response or dephosphorylation of a substrate such as
lysophosphatidic acid, ceramide 1-phosphate, or sphingosine 1-phosphate, or treatment or
amelioration of obesity associated with diabetes. The protein of SEQ ID NO:399 or fragment
thereof may be administered directly to the individual or, alternatively, a nucleic acid encoding the
protein of SEQ ID NO:399 or a fragment thereof may be administered to the individual.
Alternatively, an agent which increases the activity of the protein of SEQ ID NO:399 may be
administered to the individual. Such agents may be identified by contacting the protein of SEQ ID
NO:399 or a cell or preparation containing the protein of SEQ ID NO:399 with a test agent and
assaying whether the test agent increases the activity of the protein. For example, the test agent
may be a chemical compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:399 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere 
with the activity of the protein of SEQ ID NO:399 may be identified by contacting the protein of 
SEQ ID NO:399 or a cell or preparation containing the protein of SEQ ID NO:399 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an 
antisense nucleic acid or a triple helix-forming nucleic acid. 

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably brain, or
to distinguish between two or more possible sources of a tissue sample on the basis of the level of
the protein of SEQ ID NO:399 in the sample. For example, the protein of SEQ ID NO:399 or
fragments thereof may be used to generate antibodies using any techniques known to those skilled
in the art, including those described herein. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a tissue sample is contacted with the antibody,
which may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from brain or tissues other than brain to determine whether the test sample is from brain.
Alternatively, the level of the protein of SEQ ID NO:399 in a test sample may be measured by
determining the level of RNA encoding the protein of SEQ ID NO:399 in the test sample. RNA
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization,
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
brain or tissues other than brain to determine whether the test sample is from brain.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:399,
using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:399 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:399 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:399 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:399. In some embodiments, the protein of SEQ ID NO:399 or fragments
thereof may be used to diagnose cancer. In such techniques, the level of the protein of SEQ ID
NO:399 in an ill individual is measured using techniques such as those described herein. The level
of the protein of SEQ ID NO:399 in the ill individual is compared to the level in normal individuals.
An elevated level or decreased level of the protein of SEQ ID NO:399 relative to normal individuals
suggests that the ill individual may suffer from cancer or be predisposed to getting cancer in the
future.
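
By way of illustration only, the comparison of a test level against the levels in normal individuals might be sketched as follows. The function name `classify_level`, the use of the mean and standard deviation of the normal group, and the cutoff of two standard deviations are assumptions made for this sketch; the specification does not prescribe any particular statistic or threshold.

```python
from statistics import mean, stdev


def classify_level(test_level, normal_levels, n_sd=2.0):
    """Classify a measured level relative to a reference group.

    Hypothetical helper: the n_sd cutoff is an illustrative assumption,
    not a threshold stated in the specification.
    """
    reference_mean = mean(normal_levels)
    reference_sd = stdev(normal_levels)  # sample standard deviation
    if test_level > reference_mean + n_sd * reference_sd:
        return "elevated"
    if test_level < reference_mean - n_sd * reference_sd:
        return "decreased"
    return "normal"


# Made-up protein levels (arbitrary units) from normal individuals.
normal_group = [10, 11, 9, 10, 10]
print(classify_level(12, normal_group))
print(classify_level(8, normal_group))
print(classify_level(10.5, normal_group))
```

In practice the reference group, the units and the cutoff would be chosen for the particular assay used to measure the protein or RNA level.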

In another embodiment, the present invention relates to methods of preparing a PAP protein
of SEQ ID NO:399 comprising the steps of (i) transforming a host cell with an expression vector
comprising a polynucleotide encoding SEQ ID NO:399, (ii) culturing the transformed host cells
which express the protein and (iii) isolating the protein. The present invention also relates to a
method of dephosphorylating a substrate comprising contacting the substrate with an effective
amount of isolated protein of SEQ ID NO:399 or a fragment thereof such that the protein catalyzes
the dephosphorylation of the substrate. It is further provided that this method occurs in vitro, and
comprises a step of isolating the dephosphorylated substrate. Additionally, the method can occur in
vivo, and is effected by the administration of the protein of the invention (or part of it) to a mammal
in need thereof.

Protein of SEQ ID NOs:258 and 262 (internal designations 110-007-1-0-C7-CS, 116-055-1-0-A3-CS):

The protein of SEQ ID NO:258 is encoded by the cDNA of SEQ ID NO:17. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:258
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 110-007-1-0-C7-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:17 described throughout the present application also pertain
to the human cDNA of clone 110-007-1-0-C7-CS. The protein of SEQ ID NO:258 shows
homologies to two high affinity IgE receptor-like proteins (IGER) with GENESEQP accession
numbers W96745 and W41056, the disclosures of which are incorporated herein by reference in
their entireties. The protein of SEQ ID NO:258 is expressed in liver and testis. The protein of SEQ
ID NO:262, encoded by SEQ ID NO:21, is a variant of the protein of SEQ ID NO:258 and shares
all the potential uses and functions described herein. This protein and cDNA share all of the
characteristics and uses of clone 116-055-1-0-A3-CS and the product thereof.

Like the two high affinity IgE receptor-like proteins, the protein of the invention contains
four transmembrane spanning domains of 20 amino acids, between amino acids 53-73, 79-99,
121-141 and 158-178, respectively. The protein of SEQ ID NO:258 crosses the plasma membrane four
times, forming two small extracellular loops, and has both the N- and C-termini in the cytoplasm.
Moreover, the protein of the invention contains a signal peptide (cleavage site at position 21).
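
The topology described above can be checked with a short sketch. The segment coordinates and the cysteine positions (147 and 156) are taken from this description; the helper names `connecting_loops` and `assign_sides` are hypothetical. The sketch assumes only that each membrane crossing flips the side of the membrane, starting from a cytoplasmic N-terminus. Counted inclusively, each listed range covers 21 positions, so the stated length of 20 amino acids presumably treats one boundary residue as lying outside the segment.

```python
# Transmembrane (TM) segments as (first, last) residue positions, from the text.
TM_SEGMENTS = [(53, 73), (79, 99), (121, 141), (158, 178)]
# Cysteine positions reported for the second extracellular domain.
CYSTEINES = [147, 156]


def connecting_loops(segments):
    """Return the residue ranges lying between consecutive TM segments."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    ]


def assign_sides(loops, n_terminus_side="cytoplasm"):
    """Label each loop with the side of the membrane it faces.

    Each crossing flips the side, so the first loop lies opposite the
    N-terminus and subsequent loops alternate.
    """
    opposite = {"cytoplasm": "extracellular", "extracellular": "cytoplasm"}
    side = opposite[n_terminus_side]
    labelled = []
    for loop in loops:
        labelled.append((loop, side))
        side = opposite[side]
    return labelled


loops = connecting_loops(TM_SEGMENTS)
topology = assign_sides(loops)
extracellular = [loop for loop, side in topology if side == "extracellular"]
first, last = extracellular[1]  # loop between TM3 and TM4
cysteines_in_loop = [c for c in CYSTEINES if first <= c <= last]

for loop, side in topology:
    print(loop, side)
print("cysteines in second extracellular loop:", cysteines_in_loop)
```

Under these assumptions the sketch yields two extracellular loops, and both reported cysteines fall within the loop between the third and fourth transmembrane segments.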

The predicted structure of the protein of SEQ ID NO:258 demonstrates the relationship of
this protein to FcεRIβ and CD20 antigen and provides evidence for a family of 4-transmembrane
spanning proteins. The conservation of amino acids between all three proteins is highest in the four
transmembrane domains. While greater divergence exists in the hydrophilic amino and carboxyl
termini, several amino acids within these regions are conserved, such as the presence of 4 prolines in
the amino terminus of all three proteins. In addition, two cysteine residues (positions 147 and 156)
are present in the second extracellular domain between TM3 and TM4. This suggests that inter- or
intra-molecular disulfide bonds in this domain are present in all three proteins.

FcεRIβ is part of a tetrameric receptor complex consisting of an α chain, a β chain and two γ
chains (Kinet et al., Proc. Natl. Acad. Sci. USA, 85:6483-6487 (1988), the disclosure of which is
incorporated herein by reference in its entirety). Together, they mediate interaction with IgE-bound
antigens leading to dramatic cellular responses, such as the massive degranulation of mast cells.
The β subunit is a 4-transmembrane protein with both the amino and carboxyl termini residing in
the cytoplasm.

Chromosome mapping localized the cDNA of SEQ ID NO:17 to chromosome 11q12, the
location of the CD20 gene. However, the murine FcεRIβ and Ly-44 (the murine equivalent of
CD20) are both located in the same position in mouse in chromosome 19 (Teder, T.F. et al., J.
Immunol. 141:4388-4394 (1988), Clark E.A. and Lane, J.L. Annu. Rev. Immunol. 9:97-127 (1991),
the disclosures of which are incorporated herein by reference in their entireties). Therefore, the
three genes are believed to have originated and evolved from the same locus, further
supporting the proposition that they are members of the same family of related proteins.

On the basis of the foregoing information, it is believed that the protein of SEQ ID NO:258 
is a high affinity immunoglobulin E receptor-like protein. 

Atopic diseases, which include allergy, asthma, atopic dermatitis (or eczema) and allergic
rhinitis, are generally defined as disorders of immunoglobulin E (IgE) responses to common
antigens, such as pollen or house dust mites. Atopy is frequently detected by either elevated total serum
IgE levels, antigen-specific IgE response or positive skin tests to common allergens. In principle,
atopy can result from dysregulation of any part of the pathway which begins with antigen exposure
and IgE response to the interaction of IgE with its receptor on mast cells, the high affinity Fc
receptor FcεRI, and the subsequent cellular activation mediated by that ligand-receptor
engagement (Ravetch, Nature Genetics, 7:117-118 (1994), the disclosure of which is incorporated
herein by reference in its entirety).

Accordingly, the protein of SEQ ID NO:258 or fragments comprising at least 5, 8, 10, 12,
15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200 consecutive amino acids thereof, or fragments
having a desired biological activity may be administered to an individual in whom it is desired to
increase or decrease the activity of the protein of SEQ ID NO:258. In particular, the protein of SEQ
ID NO:258 or fragment thereof may be administered to an individual in whom it is desired to
regulate the extent of the IgE response. In such methods, the protein of SEQ ID NO:258 or
fragment thereof may be administered directly to the individual or, alternatively, a nucleic acid
encoding the protein of SEQ ID NO:258 or a fragment thereof may be administered to the
individual. Alternatively, an agent which increases the activity of the protein of SEQ ID NO:258
may be administered to the individual. Such agents may be identified by contacting the protein of
SEQ ID NO:258 or a cell or preparation containing the protein of SEQ ID NO:258 with a test agent
and assaying whether the test agent increases the activity of the protein. For example, the test agent
may be a chemical compound or a polypeptide or peptide.

The protein of SEQ ID NO:258 or fragments thereof may also be used to identify genes or
polypeptides that may play a role in IgE responses or atopic disease. In particular, binding partners
for the protein of SEQ ID NO:258 or the genes encoding such binding partners may be identified
using a variety of techniques familiar to those skilled in the art, including the techniques described
herein.

The protein of SEQ ID NO:258 or the polynucleotide encoding the protein of SEQ ID
NO:258 may also be used to diagnose hereditary atopy. In particular, the level of the protein of
SEQ ID NO:258 may be determined in a test individual using methods such as those described
herein and compared to the levels of normal individuals and individuals suffering from hereditary
atopy to determine whether the test individual is suffering from or at risk of suffering hereditary
atopy. Alternatively, a nucleic acid sample may be obtained from a test individual and analyzed to
determine whether it contains a level of RNA encoding the protein of SEQ ID NO:258 which is
associated with hereditary atopy or a mutation in the gene encoding the protein of SEQ ID NO:258
which is associated with hereditary atopy. For example, a nucleic acid sample from the test
individual may be contacted with a nucleic acid probe comprising the nucleic acid encoding the
protein of SEQ ID NO:258 or a fragment thereof to determine the RNA level or whether the
individual has a mutation associated with hereditary atopy. The probe may be either DNA,
including cDNA or genomic DNA, or the probe may be RNA. Any of the methods familiar to those
skilled in the art may be used in these diagnostic methods, including the methods described herein.
For example, the presence of a mutation associated with hereditary atopy can be determined using
methods generally known in the art, such as but not limited to PCR, sequencing or mini sequencing
as described in the method of Yamamoto et al. (Biochem. Biophys. Res. Comm., 182:507 (1992),
the disclosure of which is incorporated by reference herein in its entirety).

The protein of SEQ ID NO:258 can also be used to characterize the induction of expression
of FcεRI and the particular function of FcεRIβ. As such, the protein of the invention can be useful
in, for example, the design of drugs that block or inhibit induction or activity of FcεRI, thereby
treating atopic diseases. In particular, test agents which block or inhibit induction or activity may
be identified using the methods described herein.

In another embodiment, the protein of SEQ ID NO:258 can be employed in the preparation
of antibodies, such as monoclonal antibodies, according to methods known in the art, including
those described herein. The antibodies can be used to block or mimic ligand binding to the receptor
comprising the protein of the invention or other receptors, such as but not limited to FcεRI. The
antibodies can also be used to isolate the protein of SEQ ID NO:258 or cells which express the
protein of SEQ ID NO:258 using methods such as those described herein. For example, the
antibodies may be used to measure the presence of cells containing the protein of SEQ ID NO:258
(including but not limited to hematopoietic cells) in a sample. For example, the method comprises
contacting the sample with the antibody under conditions sufficient for the antibody to bind to the
protein of SEQ ID NO:258 and detecting the presence of bound antibody using methods known in
the art, including those described herein.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably liver and
testis, or to distinguish between two or more possible sources of a tissue sample on the basis of the
level of the protein of SEQ ID NO:258 in the sample. For example, the protein of SEQ ID NO:258
or fragments thereof may be used to generate antibodies using any techniques known to those
skilled in the art, including those described herein. Such tissue-specific antibodies may then be
used to identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue
that has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a tissue sample is contacted with the
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The
level of antibody binding to the test sample is measured and compared to the level of binding to
control cells from liver or testis or tissues other than liver or testis to determine whether the test
sample is from liver or testis. Alternatively, the level of the protein of SEQ ID NO:258 in a test
sample may be measured by determining the level of RNA encoding the protein of SEQ ID NO:258
in the test sample. RNA levels may be measured using nucleic acid arrays or using techniques such
as in situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the
art. If desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic
acid sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in
control cells from liver or testis or tissues other than liver or testis to determine whether the test
sample is from liver or testis.

Protein of SEQ ID NO:279 (internal designation 160-58-3-0-H3-CS)

The protein of SEQ ID NO:279 is encoded by the cDNA of SEQ ID NO:38. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:279
described throughout the present application also pertain to the polypeptide encoded by a nucleic
acid included in clone 160-58-3-0-H3-CS. In addition, it will be appreciated that all characteristics
and uses of the nucleic acid of SEQ ID NO:38 described throughout the present application also
pertain to the nucleic acid included in clone 160-58-3-0-H3-CS.

The protein of SEQ ID NO:279 is encoded by a nucleic acid of 1330 nucleotides with an
ORF between nt 198 and 998 yielding a 267 amino acid protein. The protein is a polymorphic variant
of the sequence (SP:P01210) for proenkephalin A precursor (contains Met- and Leu-enkephalins).
It has a signal peptide spanning 24 amino acids and 2 signature motifs for vertebrate endogenous
opioid neuropeptides and endogenous opioid neuropeptide precursors. PSORT gives a predicted
extracellular localization, including the cell wall (66.7%). The protein of SEQ ID NO:279 is
primarily distributed in the fetal brain, although expression in other tissues has also been shown (see
below). The polymorphic variation is found at amino acid position 75 (E->D, a conservative amino
acid change). After signal peptide cleavage (amino acids 47 to 267; 220 amino acids), the protein still
contains the polymorphic variation, which is now at amino acid position 29. This does not change
any of the sequence of the different enkephalins that result after cleavage of this precursor protein.
In addition, the polymorphism is 25 amino acids away from the first cleavage site on the amino
terminal side. This is unlikely to change the secondary structure of the actual cleavage site.
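
The coordinate arithmetic in the preceding paragraph can be reproduced with a minimal sketch. The helper names `codon_count` and `mature_position` are hypothetical, and the first cleavage site is assumed here to be the start of Met-enkephalin 1 at residue 100, as given in the peptide list of this section. The sketch also assumes that the ORF coordinates exclude the stop codon, since 801 nucleotides correspond to exactly 267 codons.

```python
def codon_count(orf_start_nt, orf_end_nt):
    """Number of codons in an ORF given inclusive nucleotide coordinates."""
    length_nt = orf_end_nt - orf_start_nt + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length_nt // 3


def mature_position(precursor_position, mature_start):
    """Renumber a precursor residue relative to the first retained residue."""
    if precursor_position < mature_start:
        raise ValueError("residue is removed by cleavage")
    return precursor_position - mature_start + 1


codons = codon_count(198, 998)               # ORF between nt 198 and 998
variant_after_cleavage = mature_position(75, 47)
distance_to_first_site = 100 - 75            # assumed site: residue 100
inclusive_span_47_to_267 = 267 - 47 + 1

print(codons, variant_after_cleavage, distance_to_first_site)
print(inclusive_span_47_to_267)
```

The renumbered position 29 and the distance of 25 residues agree with the values given above. Counted inclusively, residues 47 to 267 span 221 positions rather than the stated 220; the sketch reports this difference without attempting to resolve it.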

PCT publication WO9606863-A1, the disclosure of which is incorporated herein by
reference in its entirety, discloses a protein having high homology with the protein of SEQ ID
NO:279. Accordingly, the protein of SEQ ID NO:279 is believed to be an enkephalin. Met-
and Leu-enkephalins compete with and mimic the effects of opiate drugs. These two pentapeptides
with potent opiate agonist activity in bioassay systems were originally identified by Hughes et al.
(Nature, 258, 577-580, 1975). The natural ligands for opiate receptors, which differ only in their
COOH terminal amino acid, were named Met- and Leu-enkephalin to reflect their origin from the
brain. Peptides containing these sequences are termed opiate or opioid peptides. Enkephalins are
widely distributed throughout the central nervous system in enkephalinergic neuronal networks, and
also exist in the peripheral nervous system, for example in autonomic ganglia. Data, largely
circumstantial, suggest wide-ranging involvement of endogenous opioids, for example in the
modulation of pain perception, in mood and behaviour, learning and memory, responses to stress,
diverse neuroendocrine functions, immune regulation and cardiovascular and respiratory function.

Met-enkephalin enhances the immune reaction in patients with cancer or AIDS. It can bind
opioid receptors present in peripheral inflamed tissues to mediate an analgesic effect.

After exogenous administration of the different enkephalins, several immunologic functions
are affected, including antibody production, NK cell activity against tumors and viral infections,
macrophage and polymorphonuclear leukocyte functions, graft rejections, and mitogen-stimulated
lymphocyte proliferation. The effects can be bi-directional, where low concentrations enhance, and
high concentrations inhibit the same immune function. Thus, enkephalins are modulators of
immune reactions.

These opioid neuropeptides are released by post-translational proteolytic processing of
precursor proteins. These multivalent precursor proteins (polyproteins) consist of a signal sequence
followed by a conserved region of about 50 residues, a variable length region and the sequence of
the various neuropeptides. The preproenkephalin A (gene PENK) is processed to produce the
following peptides which include Met-enkephalin (6 copies, 2 of which are extended) and
Leu-enkephalin:

Signal peptide   1-24
Peptide          100-104   Met-enkephalin 1
Peptide          107-111   Met-enkephalin 2
Peptide          136-140   Met-enkephalin 3
Peptide          186-193   Met-enkephalin-arg-gly-leu
Peptide          210-214   Met-enkephalin 4
Peptide          230-234   Leu-enkephalin
Peptide          261-267   Met-enkephalin-arg-phe
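
The peptide coordinates listed above can be checked for internal consistency with a brief sketch. The residue ranges and names are taken from the list; the one-letter sequences are the standard enkephalin sequences (YGGFM for Met-enkephalin, YGGFL for Leu-enkephalin), with the extensions inferred from the peptide names, and are supplied here for illustration rather than drawn from the sequence listing. The names `PEPTIDES` and `mismatched_spans` are hypothetical.

```python
# (name, first residue, last residue, assumed one-letter sequence)
PEPTIDES = [
    ("Met-enkephalin 1", 100, 104, "YGGFM"),
    ("Met-enkephalin 2", 107, 111, "YGGFM"),
    ("Met-enkephalin 3", 136, 140, "YGGFM"),
    ("Met-enkephalin-arg-gly-leu", 186, 193, "YGGFMRGL"),
    ("Met-enkephalin 4", 210, 214, "YGGFM"),
    ("Leu-enkephalin", 230, 234, "YGGFL"),
    ("Met-enkephalin-arg-phe", 261, 267, "YGGFMRF"),
]


def mismatched_spans(peptides):
    """Return names whose residue span does not match the sequence length."""
    return [
        name
        for name, first, last, sequence in peptides
        if last - first + 1 != len(sequence)
    ]


# Copies containing the Met-enkephalin core, and those carrying an extension.
met_copies = [name for name, _, _, seq in PEPTIDES if seq.startswith("YGGFM")]
extended = [name for name, _, _, seq in PEPTIDES if len(seq) > 5]

print("span mismatches:", mismatched_spans(PEPTIDES))
print("Met-enkephalin copies:", len(met_copies), "extended:", len(extended))
```

Under these assumptions every listed span matches the length of its peptide, and the count of six Met-enkephalin copies, two of them extended, agrees with the statement preceding the list.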

The conserved region in the N-termini of these precursors contains six cysteines that are
probably involved in disulfide bonds. This region could also be important for the processing of the
neuropeptides.

The precursor protein does have the potential to be differentially cleaved into multiple
extended enkephalin and non-enkephalin-containing peptides, the functions of which are largely
unknown; however, in some cases it has been shown that extended enkephalin-containing peptides
have enhanced opiate activity. Another peptide, enkelytin, is produced that exhibits anti-bacterial
activity (see below).

There is a growing body of evidence that proenkephalin exists largely independently of free
enkephalin peptides in a number of tissues and cell types including astrocytes (Melner et al, EMBO
J, 9, 791-796, 1990; Spruce et al, EMBO J 9, 1787-1795, 1990, the disclosures of which are
incorporated herein by reference in their entireties), and is released from these cells in an
unprocessed form (Batter et al, Brain Res. 563, 28-32, 1991, the disclosure of which is incorporated
herein by reference in its entirety). There is evidence in some cases that processing enzymes are
co-released along with the unprocessed precursor, which suggests that extracellular cleavage may occur
(Vilijn et al, J. Neurochem. 53, 1487-1493, 1989). Even if biological activity is signalled through
binding of the small peptide products to cell surface receptors, the regulation of this activity may be
mediated through the precursor, and it is also possible that the unprocessed precursor has an
additional intracellular role of its own.

This protein was originally described to be present in various brain regions, most notably in
the striatum as well as in neuroendocrine tissues, the pituitary and adrenal gland. It is also
expressed in a variety of immune cells, including ConA-stimulated CD4 T lymphocytes, CD4
thymocytes, B lymphocytes, as well as T cell lines, macrophages and mast cells. Expression has
been reported in the reproductive system, heart and many developing tissues during gestation and
the early postnatal period. Because of this, it has been postulated that these peptides play a role in cell
or tissue growth and differentiation. For example, endogenous enkephalins induced in thymocytes
modulate their own expression and function to inhibit the proliferation of activated thymocytes.

Enkephalin peptides are abundant in adrenal medulla and can be released by
neurotransmitters specific for that tissue. Enkephalins have also been found to be abundant in
human phaeochromocytoma, a tumour derived from the adrenal medulla. The RNA from this
tumour contains a high level of enkephalin mRNA sequences as demonstrated by cell-free
translation studies.

Enkephalins function at opiate receptors, which are classified as delta, kappa and mu. A study by
Lord et al (Nature, 267, 495-499, 1977) compared the activity of morphine and enkephalins in
bioassay systems, and found that enkephalins bound predominantly to delta receptors. Subsequent
studies have revealed homology of these receptors to other receptor families, including the
immunoglobulin superfamily member OBCAM (Schofield et al, EMBO J 8, 489-495, 1989, the
disclosure of which is incorporated herein by reference in its entirety) and somatostatin receptors
(PCT publication WO96/06863, the disclosure of which is incorporated herein by reference in its
entirety). This would explain the reported opioid binding properties of the former. Because of the
latter's homology to opiate receptors, it would also be expected to bind opioid receptor ligands.
The recognition of opioid peptides by other non-opiate related receptors implies that these peptides
may exert other as yet unknown functions.

Enkephalins are also involved in apoptosis. Apoptosis is the morphologically distinct
process of controlled cell death which balances the process of cell production by mitosis. A
molecular connection between control of cell production and cell elimination has now been
established, including the roles of c-myc and p53 in the pathways mediating apoptotic cell death. It
has been proposed that all mammalian cells may be programmed to die by default in the absence of
continuous signalling from neighboring cells. However, the acquisition of a survival advantage
which prevents a single cell from activating its suicide program in response to levels of genetic
damage associated with common environmental insults could theoretically be an initiating event in
oncogenesis since it would favor the persistence of potentially tumorigenic mutations.
Alternatively, inappropriate activation of survival pathways might lead to overriding the intrinsic
death program and promote tumorigenesis at early and late stages. A particularly potent oncogenic
pathway would be one which both promoted and tolerated genetic damage and helped a cell
overcome its need for extracellular survival signals. Approximately 50% of human tumors possess
normal p53 function. Thus, additional pathways or molecules which inappropriately repress
apoptosis in human tumours remain to be identified. Opioid-like molecules could be involved in
such a pathway.

There are published reports that pathways which include opioid-like molecules participate
in regulating the equilibrium between cell death and survival. For example, morphine inhibits cell
survival in the developing cerebellum (Hauser et al, Exp. Neurol, 130, 95-105, 1994, the disclosure
of which is incorporated herein by reference in its entirety) and induces apoptosis in thymocytes
(Fuchs and Pruett, J. Pharmacol. Exp. Ther. 266, 417-423, 1993, the disclosure of which is
incorporated herein by reference in its entirety).

In a series of experiments (PCT publication WO 96/06863), it has been found that
proenkephalin and/or its proteolytic products act as extracellular and/or cell surface membrane
bound factors which modulate cell survival in transformed cells a) upon deprivation of exogenous
survival factors, and b) following genotoxic injury and/or stress when exogenous survival factors
are non-limiting. The receptor(s) to which these factor(s) bind, which are most likely to exist on the
cell surface, are related, or possibly identical, to one or more members of the opioid receptor family.
Opioid-like receptor types or subtypes can mediate survival or death; receptor(s)
which mediate death appear to be coupled to those which mediate survival. Natural ligands for these
receptors are likely to be products of the opioid precursor genes, although natural ligands could
include cytokines which mimic their effect. Tumour cells are more sensitive to antagonism of
opioid-like receptor-mediated survival, and to stimulation of opioid-like receptor-mediated death,
than non-transformed cells. The induction of cell cycle arrest enhances the sensitivity of tumour
cells to these manipulations. (Enhanced sensitivity of tumour cells to these manipulations is induced
by their synchronisation within the cell cycle.)

Cytoplasmic proenkephalin and/or its proteolytic products act as general repressors of
apoptosis. Agents which, if coupled to appropriate internalisation agents, would antagonise
cytoplasmic proenkephalin would therefore be of use in the induction of apoptosis in
non-transformed as well as transformed cells, particularly in combination with sublethal doses of
known apoptosis-inducing agents.

The repression of apoptosis mediated through cytoplasmic proenkephalin is activated at
high cell density predominantly by nondiffusable factors. Inhibition of proenkephalin or its products
as described above would therefore be potentiated if agents were used in combination, for example
with neutralising antibodies to integrins (such as the antibody 23C6; Bates et al., J. Cell Biol. 125,
403-415, 1994), to reduce exogenous survival signaling and simulate low density.

Proenkephalin targeted to the cell nucleus induces apoptotic death, which is inhibited by the
overexpression of large T antigen and is at least partly mediated through p53. Tumors which retain
wild-type p53 function are therefore a particular target for apoptosis induction by agents which
increase the levels of proenkephalin, or its derivatives, within the nucleus or which mimic the
function of nuclear proenkephalin or its derivatives.

Accordingly, the protein of SEQ ID NO:279, fragments thereof, or nucleic acids encoding
the protein of SEQ ID NO:279 may be used to modulate a biochemical pathway in which products
of opioid peptide precursor genes participate. In some embodiments, antibodies or other agents
which reduce the level or activity of the protein of SEQ ID NO:279 or fragments thereof may be
used to induce apoptosis in cells. The agents preferably neutralize the protein of SEQ ID NO:279
or its proteolytic derivatives, increase the level of, activate or mimic nuclear proenkephalin, or act
as an antagonist to receptors related or identical to the delta and kappa opioid receptors. In some
embodiments, the agent may be a neutralizing monoclonal antibody against the protein of SEQ ID
NO:279 or a fragment thereof. The agent may also be a fragment or allelic form of one of these
antibodies. A cytoplasmic anchor or a nuclear localization signal may also be included in the
agent. In some embodiments, the agent is able to modulate a biochemical pathway in a cell in
which products of opioid peptide precursor genes participate in order to induce apoptosis. The
agents can be used for the treatment of cancer or for inducing apoptosis in lens cells following a
cataract operation. In some embodiments, the agents promote apoptosis of proliferating cells with
less, or no, effect on normal mature cell types. The agents may be administered in combination
with a genotoxic or cell cycle arrest agent. Alternatively, the agent may be complexed with a
chemotherapeutic, irradiation or cell cycle arrest (synchronization) agent.

Accordingly, the invention provides a means of inducing apoptosis in cells which comprises
modifying a biological pathway of a cell in which a product of an opioid precursor gene participates
in such a way that apoptosis is induced. Modification of the pathway is suitably effected by
administration of an appropriate agent. In particular, the present invention provides an agent for use
in inducing apoptosis in cells, said agent comprising an agent able to neutralise proenkephalin or its
proteolytic derivatives; an agent which increases the level of nuclear proenkephalin and/or its
derivatives, or which activates or mimics them; an agent which acts as an antagonist at receptor(s)
related or identical to the delta opioid receptor; or an agent which acts as an agonist at receptor(s)
related or identical to the kappa opioid receptor.

A subset of such agents are agents able to neutralise proenkephalin or its proteolytic
derivatives, or an agent which acts as an antagonist at receptor(s) related or identical to the delta
opioid receptor, or an agent which acts as an agonist at receptor(s) related or identical to the kappa
opioid receptor.

In some embodiments, the agent may be administered to the cell surface, whereupon the
survival effects of extracellular and/or cell surface membrane bound proenkephalin or its proteolytic
derivatives are neutralised, causing the cell to become apoptotic. Alternatively, an agent able to
neutralise proenkephalin or its proteolytic derivatives may be coupled to an internalisation peptide
and a cytoplasmic anchor. Such an assembly will remain in the cytoplasm of the cell, antagonising
cytoplasmic proenkephalin and/or its proteolytic products and thus neutralising the apoptosis
repressor effect of these molecules.

Enkephalins also have anti-bacterial activity. During processing of the proenkephalin-A,
the maturation in the adrenal medullary chromaffin cell starts with the removal of the
carboxy-terminal end (proenkephalin-A-derived peptide or PEAP209-239) (Y. Goumon, K. Lugardon, B.
Kieffer et al. J. Biol. Chem. 273:29847-29856, 1998, the disclosure of which is incorporated
herein by reference in its entirety). The peptide enkelytin was identified as corresponding to
bisphosphorylated PEAP209-237, and possesses antibacterial activity against Staphylococcus aureus
and other gram-positive bacteria such as Micrococcus luteus and Bacillus megaterium (0.2-0.4 µM
range). It does not affect the growth of gram-negative bacteria (E. coli strains D22, D31, 663 and
T13773), nor does it have any hemolytic activity. The activity of this peptide is specific:
shorter versions of the peptide (209-220, 224-237, 230-237, 233-237) or non-phosphorylated
PEAP209-239 exhibited little to no bacterial growth inhibiting activity.

Bovine periarthritis abscess fluid contains different forms of PEAP (72-237/239;
80-237/239) as identified by immunoreactivity and confirmed by sequence analysis. These peptides
have activity against M. luteus, but are less active than enkelytin (5 versus 0.2 µM). These PEAP
constitute a pool of precursors which have to be processed, during infection, to provide active
enkelytin. Presence of a PEAP at a molecular mass corresponding to that of PEAP209-237 was
detected as well. PEAPs (PEAP202-238 and PEAP206-237) have also been detected in wound fluids,
including bovine post-caesarean abscess in the subcutaneous lining, and an abscess induced by
subcutaneous injection of complete Freund's adjuvant. Therefore, these peptides are present in
wound fluids along with other known antibacterial peptides (defensins, bactenecins). The
concentrations were in a range similar to that found to be active in vitro (0.5-1 µM). The PEAPs
have also been detected in secretions from human polymorphonuclear neutrophils.

PEAP209-230 and enkelytin are secreted from cultured chromaffin cells following
stimulation. This suggests that these two peptides are co-released with catecholamines in stress
situations and may therefore play an important role in defense mechanisms.

Co-release of met-enkephalin and enkelytin would represent a unified neuroimmune
protective response to stress situations that may be accompanied by infectious diseases. This
would provide a highly beneficial survival strategy at the very beginning of proinflammatory
processes. This protein would therefore play an important role in host defense against microbial
infections, especially those involving gram-positive bacteria. Due to their nonspecific activity on
membranes, the antibacterial peptides possess cytotoxic activities and may play a role not only in
antimicrobial defense, but also in inflammatory processes and possibly in wound repair.

The protein of SEQ ID NO:279, peptides derived by cleavage thereof, or fragments thereof
could be used as antibacterial agents in creams, ointments, or solutions, in presoaked bandages,
or in dermal-type patches for external applications. Alternatively, the protein of SEQ ID NO:279,
peptides derived by cleavage thereof, or fragments thereof may be used in injections (intravenous,
subcutaneous, or intraperitoneal). Such uses include wound repair, burn healing, and post-operative
recovery management.

Alternatively, the protein of SEQ ID NO:279, peptides derived by cleavage thereof, or
fragments thereof, may be incorporated into disinfectant solutions used for cleaning surfaces such
as in the house (kitchen, bathroom) or in the office (desktops, phones, computer keyboards and
mice). Other applications are as additives in mouthwash or handi-popup wipes.

Altered levels of enkephalins may produce psychological disease. Konig et al. (Nature, 383,
535-538, 1996, the disclosure of which is incorporated herein by reference in its entirety) used a
genetic approach to study the role of the mammalian opioid system. They disrupted the
pre-proenkephalin gene using homologous recombination in embryonic stem cells to generate
enkephalin-deficient mice. Mutant enk -/- animals are healthy, fertile, and care for their offspring,
but display significant behavioral abnormalities. Mice with the enk -/- genotype are more anxious,
and males display increased offensive aggressiveness. Mutant animals show marked differences
from controls in supraspinal, but not in spinal, responses to painful stimuli. These enk -/- mice do,
however, exhibit normal stress-induced analgesia. Therefore, enkephalins modulate responses to
painful stimuli, and genetic factors may contribute significantly to the experience of pain. This
study clearly indicates the importance of enkephalins in pain perception, anxiety and
aggressiveness.

Interestingly, the PENK gene is localized on 8q23-q24, the same locus at which genes
related to epilepsy and spastic paraplegia, disorders related to brain dysfunction, are found.

Accordingly, the protein of SEQ ID NO:279 or fragments thereof may be used for the
treatment of psychological disorders, especially those involving distortion in the perception of pain,
aggressiveness, or anxiety. This would include drug addiction, different types of phobias, panic
attacks, schizophrenia, bipolar disorder, anorexia nervosa, chronic pain disorders, post-traumatic
events, and post-operative pain management.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:279,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, a condition
resulting from increased or decreased cellular proliferation, bacterial infection, conditions resulting
from abnormal immune responses, psychological disease or any of the conditions listed above. In
such embodiments, the protein of SEQ ID NO:279, or a fragment thereof, is administered to an
individual in whom it is desired to increase or decrease any of the activities of the protein of SEQ
ID NO:279. The protein of SEQ ID NO:279 or fragment thereof may be administered directly to
the individual or, alternatively, a nucleic acid encoding the protein of SEQ ID NO:279 or a
fragment thereof may be administered to the individual. Alternatively, an agent which increases the
activity of the protein of SEQ ID NO:279 may be administered to the individual. Such agents may
be identified by contacting the protein of SEQ ID NO:279 or a cell or preparation containing the
protein of SEQ ID NO:279 with a test agent and assaying whether the test agent increases the
activity of the protein. For example, the test agent may be a chemical compound or a polypeptide
or peptide.

Alternatively, the activity of the protein of SEQ ID NO:279 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:279 may be identified by contacting the protein of
SEQ ID NO:279 or a cell or preparation containing the protein of SEQ ID NO:279 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, fetal brain, or to distinguish between two or more possible sources of a sample on the
basis of the level of the protein of SEQ ID NO:279 in the sample. For example, the protein of SEQ
ID NO:279 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples or differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from fetal brain or tissues other than fetal brain to determine whether the test sample is from fetal
brain. Alternatively, the level of the protein of SEQ ID NO:279 in a test sample may be measured
by determining the level of RNA encoding the protein of SEQ ID NO:279 in the test sample. RNA
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization,
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
fetal brain or tissues other than fetal brain to determine whether the test sample is from fetal brain.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:279,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:279 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:279 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:279 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:279. In such techniques, the level of the protein of SEQ ID NO:279 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:279 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:279 which is associated
with disease.
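
The comparison of an individual's protein level with the level in normal individuals can be illustrated with a minimal sketch. The specification does not prescribe any statistical test or cutoff; the z-score approach, the function name `flag_altered_level`, the cutoff of 2.0 and the example values below are all assumptions chosen for illustration only.

```python
from statistics import mean, stdev


def flag_altered_level(patient_level, normal_levels, z_cutoff=2.0):
    """Compare one measured level against a reference group of normal individuals.

    Hypothetical helper: the z-score test and the default cutoff are
    illustrative assumptions, not values taken from the specification.
    """
    mu = mean(normal_levels)
    sigma = stdev(normal_levels)  # sample standard deviation
    if sigma == 0:
        raise ValueError("normal reference levels show no variation")
    z = (patient_level - mu) / sigma
    if z > z_cutoff:
        return "elevated"
    if z < -z_cutoff:
        return "reduced"
    return "within normal range"


# Arbitrary example measurements (e.g. antibody-binding signal, arbitrary units).
normal_reference = [10.0, 11.0, 9.0, 10.0, 10.0]
print(flag_altered_level(15.0, normal_reference))
print(flag_altered_level(10.0, normal_reference))
```

In practice the reference group, the assay units and the decision rule would be set by the assay actually used; the sketch only shows the shape of the comparison.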

Protein of SEQ ID NO: 293 (internal designation 181-16-1-0-G7-CS)

The protein of SEQ ID NO: 293 has a high degree of homology with HSPC163 (Genbank 
accession number AF161512), the protein encoded by gene no: 93 (PCT/US99/17130) and the 
human cornichon protein TGAM77. SEQ ID NO: 293 is overexpressed in cancerous prostate, fetal 
brain and fetal kidney. 

The gene HSPC163 is one of three hundred cDNAs obtained from a CD34+ hematopoietic
stem / progenitor cell (HSPC) library (obtained from umbilical cord blood and adult bone marrow).
HSPC163 has also been identified in five hematopoietic cell lines: NB4 (granulocytic), HL60
(granulocytic), U937 (monocytic), K562 (erythro-megakaryocytic), and Jurkat (T lymphocytic).
These cell lines represent the distinct lineages of hematopoietic cells.

The polypeptide of gene no: 93 has been determined to have two transmembrane domains
and a short cytoplasmic tail. Based upon these characteristics, it is believed that the protein product
of gene no: 93 shares structural similarity to type IIIa membrane proteins. This gene is expressed
primarily in activated T-cells and to a lesser extent in endometrial tumor, T cell helper II cells,
microvascular endothelial cells, Raji cells treated with cycloheximide and umbilical vein
endothelial cells. The expression pattern of gene no: 93 indicates a role in regulating the
proliferation, survival, differentiation, and/or activation of hematopoietic cell lineages, including
blood stem cells. The gene product appears to be involved in the regulation of cytokine production,
antigen presentation, and other immune processes, suggesting a usefulness in boosting the immune
system. The translation product of this gene has high homology to the human TGAM77 and mouse
cornichon proteins.

TGAM77 was identified as a gene involved in the early phase of T-cell activation in response
to alloantigens. Twenty-four hours after T-cell allostimulation, RNA expression of TGAM77 is
significantly increased. TGAM77 has been designated as a T-cell growth associated molecule.
TGAM77 is a human homolog of the cornichon (cni) protein of the fruit fly Drosophila.

Cornichon was demonstrated to be involved in carefully orchestrated signaling events
during Drosophila oogenesis, establishing an asymmetric pattern in the oocyte as a prerequisite for
correct embryogenesis. Cornichon signaling functions in concert with two other proteins. The
function of all three genes in an EGF-like signaling pathway appears to direct the formation of a
correctly polarized microtubule cytoskeleton, which is thought to be the basis for the correct spatial
localization of other signaling molecules essential for oocyte polarization, asymmetric movement
of the nucleus, and embryo differentiation.

The subject invention provides the amino acid sequence of SEQ ID NO: 293 and
polynucleotide sequences encoding the amino acid sequence of SEQ ID NO: 293. In one
embodiment, the polypeptides of SEQ ID NO: 293 are interchanged with the corresponding
polypeptides encoded by the human cDNA of clone 181-16-1-0-G7-CS. Also included in the
invention are biologically active fragments of SEQ ID NO: 293 and polynucleotide sequences
encoding these biologically active fragments. "Biologically active fragments" are defined as those
peptide or polypeptide fragments of SEQ ID NO: 293 which have at least one of the biological
functions of the full length protein (e.g., the ability to stimulate T-cell proliferation).

The invention also provides variants of SEQ ID NO: 293. These variants have at least
about 80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO: 293. Variants according to the
subject invention also have at least one functional or structural characteristic of SEQ ID NO: 293,
such as the biological functions described above. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein can be
practiced utilizing SEQ ID NO: 293 or variants thereof. Likewise, the methods of the subject
invention can be practiced using biologically active fragments of SEQ ID NO: 293, or variants of said
biologically active fragments.
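
The identity thresholds recited for variants (at least about 80%, 90%, or 95%) can be illustrated with a short calculation. The sketch below assumes that the two sequences have already been aligned by some external method, and it counts identity over all aligned columns that are not gaps in both sequences; other definitions of percent identity exist, and the specification does not fix one here. The function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns that are gaps in both sequences are ignored; a residue aligned
    against a gap counts as a mismatch. This denominator is one common
    convention and is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct):
    """Return the highest recited threshold (95, 90 or 80) met, or None."""
    for threshold in (95, 90, 80):
        if pct >= threshold:
            return threshold
    return None


# Toy sequences for illustration only.
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQA"
pct = percent_identity(reference, variant)
print(pct, identity_tier(pct))
```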

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode SEQ ID NO: 293. It is well within the skill of a person trained in the art to create these
alternative DNA sequences which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention.
As used herein, reference to "essentially the same" sequence refers to sequences that have amino
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity.
Fragments retaining one or more characteristic biological activity of SEQ ID NO: 293 are also
included in this definition.

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular 
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic 
code. Various codon substitutions, such as the silent changes which produce specific restriction 
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.
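
The effect of a silent codon change that introduces a restriction site can be illustrated as follows. This is a minimal sketch assuming the standard genetic code; the 12-nucleotide sequences are invented for illustration and are not taken from SEQ ID NO: 293 or its encoding cDNA. Changing the glycine codon GGT to the synonymous codon GGA leaves the encoded peptide unchanged while creating a GGATCC (BamHI) recognition site.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical example sequences (not from the Sequence Listing).
original = "ATGGGTTCCAAA"   # ATG GGT TCC AAA
variant = "ATGGGATCCAAA"    # ATG GGA TCC AAA, silent GGT -> GGA change
BAMHI_SITE = "GGATCC"

print(translate(original), translate(variant))
print(BAMHI_SITE in original, BAMHI_SITE in variant)
```

Both sequences translate to the same four-residue peptide, and only the variant contains the restriction site, which is the kind of change the passage describes for facilitating cloning.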

SEQ ID NO: 293 protein, and variants thereof, can be used to produce antibodies according
to methods well known in the art. The antibodies can be monoclonal or polyclonal. Antibodies can
also be synthesized against fragments of SEQ ID NO: 293 as well as variants of SEQ ID NO: 293
according to known methods. The subject invention also provides antibodies which specifically
bind to biologically active fragments of SEQ ID NO: 293 or biologically active fragments of SEQ
ID NO: 293 variants.

The subject invention also provides for immunoassays which are used to screen for,
monitor, or diagnose prostate cancer. Methods of screening for, diagnosing, identifying, or
monitoring the course of prostate cancer are well known to those skilled in the art. In this aspect of
the invention, immunoassays are provided which contact a biological sample (e.g., blood, serum,
tissue, or biopsied tissue sample) with antibodies which specifically bind to SEQ ID NO: 293,
immunogenic fragments of SEQ ID NO: 293, or biologically active fragments of SEQ ID NO: 293.
Immunocomplexes formed in the contacting step are then detected using an appropriately labeled
detection reagent. The levels of SEQ ID NO: 293 expressed in the tested biological samples are
compared to control/normal levels typically observed in the population.

Alternatively, methods which screen for, monitor, or diagnose prostate cancer may be
practiced with SEQ ID NO: 293, or fragments of SEQ ID NO: 293, as well as nucleic acids
encoding SEQ ID NO: 293, or fragments of SEQ ID NO: 293. In one embodiment, the
polypeptide may be used as a standard/control in the immunoassays described above. In another
embodiment, the nucleic acids encoding SEQ ID NO: 293, or fragments of SEQ ID NO: 293, are
used in hybridization assays, well known to the skilled artisan, to identify biological samples (e.g.,
blood, serum, tissue, or biopsied tissue sample) which contain SEQ ID NO: 293. The levels of
SEQ ID NO: 293 expressed in the tested biological samples are compared to control/normal levels
typically observed in the population.

In another embodiment, SEQ ID NO: 293, and polynucleotide sequences encoding the
amino acid sequence of SEQ ID NO: 293, can be used to identify or diagnose immune disorders
involving activated T-cells using standard hybridization assays.

Another aspect of the invention provides methods of immunostimulating a mammal. In this
aspect of the invention, SEQ ID NO: 293, and/or polynucleotide sequences encoding the amino
acid sequence of SEQ ID NO: 293, are introduced into T-cells according to well known methods.
The T-cells are then activated by stimulation with antigen to induce the immune system of the mammal.

In another embodiment, autologous T-cells are obtained from an individual. SEQ ID NO:
293, biologically active fragments thereof, and/or polynucleotide sequences encoding the amino
acid sequence, or biologically active fragments, of SEQ ID NO: 293, are introduced into these
autologous T-cells according to well known methods. The T-cells are expanded and reintroduced
into the individual from which the T-cells were obtained. See, for example, U.S. Patent Nos.
5,192,537 and 5,766,920, hereby incorporated by reference in their entirety.

In another embodiment of the subject invention, polynucleotides encoding, and polypeptides
comprising, SEQ ID NO: 293 can be used to expand stem cells and committed progenitors of various
blood lineages, and in the differentiation and/or proliferation of various cell types. In this aspect of
the invention, the polynucleotides or polypeptides are introduced into the
cells and the cells are cultured. These methods may be practiced according to methods well known to
the routineer.

Protein of SEQ ID NO:316 (internal designation 188-45-1-0-D9-CS)

The protein of SEQ ID NO:316 is encoded by the cDNA of SEQ ID NO:75. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:316
described throughout the present application also pertain to the polypeptide encoded by a nucleic
acid included in clone 188-45-1-0-D9-CS. In addition, it will be appreciated that all characteristics
and uses of the nucleic acid of SEQ ID NO:75 described throughout the present application also
pertain to the nucleic acid included in clone 188-45-1-0-D9-CS.

The protein of SEQ ID NO:316 is expressed in brain and contains three membrane-spanning
segments, located between amino acid positions 6 and 26, 73 and 93, and 139 and 159, and a
signal peptide comprising the sequence FAAFCYMLSLVLC/AA. Accordingly, one embodiment
of the present invention is a polypeptide comprising one or more of the membrane-spanning
segments, and/or the signal peptide.
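
The position ranges recited for the membrane-spanning segments can be mapped onto a sequence as sketched below. The sketch assumes that the positions are 1-based and inclusive, and that the slash in FAAFCYMLSLVLC/AA marks the predicted cleavage point of the signal peptide; both are assumptions of this illustration. The sequence of SEQ ID NO:316 is not reproduced in this section, so a synthetic placeholder string is used, and the function names are hypothetical.

```python
# Segment boundaries as recited in the text (assumed 1-based, inclusive).
SEGMENTS = [(6, 26), (73, 93), (139, 159)]
SIGNAL_PEPTIDE = "FAAFCYMLSLVLC/AA"


def extract_segments(sequence, segments):
    """Return the subsequence for each (start, end) pair, 1-based inclusive."""
    extracted = []
    for start, end in segments:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"segment {start}-{end} lies outside the sequence")
        extracted.append(sequence[start - 1:end])
    return extracted


def split_signal_peptide(notation):
    """Split 'UPSTREAM/DOWNSTREAM' notation at the assumed cleavage point."""
    upstream, downstream = notation.split("/")
    return upstream, downstream


# Placeholder sequence only (160 residues); NOT the sequence of SEQ ID NO:316.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 8

for (start, end), segment in zip(SEGMENTS, extract_segments(placeholder, SEGMENTS)):
    print(start, end, len(segment), segment)
print(split_signal_peptide(SIGNAL_PEPTIDE))
```

Under the inclusive convention each recited range spans 21 residues, a length consistent with a single membrane-spanning helix.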

The protein of SEQ ID NO:316 is a member of the cornichon protein family. It has 48%
identity with the Drosophila melanogaster cornichon protein as well as 67% identity with the
human cornichon homolog TGAM77 (Genbank accession No. AF104398, the disclosure of which
is incorporated herein by reference in its entirety), 67% identity with hCornichon, a bone marrow
secreted protein (PCT publication WO/9933979, the disclosure of which is incorporated herein by
reference in its entirety), 67% identity with a human secreted protein encoded by gene 24 (PCT
publication WO/9910363, the disclosure of which is incorporated herein by reference in its entirety)
and 67% identity with the protein product of the mouse cnih gene. However, this protein has higher
homology, 81% identity, to the mouse cornichon-like protein (Genbank accession No. AB006191,
the disclosure of which is incorporated herein by reference in its entirety), which is the product of
the mouse cnil gene. Finally, the protein of SEQ ID NO:316 has a high level of identity with the
human secreted protein encoded by gene 95 (GSP:Y76218, PCT publication WO/9958660, the
disclosure of which is incorporated herein by reference in its entirety) and is likely a polymorphic
variant of gene 95. The high degree of sequence conservation between the members of this family
indicates that they are under strong selective pressure and are likely involved in important cellular
functions.

The Drosophila cornichon (cni) gene product is involved in signaling processes necessary
for both anterior-posterior and dorsal-ventral pattern formation during Drosophila embryogenesis
(Cell, 1995, 81:967-978). Mutations in cornichon prevent the formation of a correctly polarized
microtubule cytoskeleton in the oocyte. Cni signaling functions in concert with two other proteins.
Gurken, which is a protein secreted from the oocyte containing a single epidermal growth factor
(EGF) motif most similar in structure to vertebrate TGFα, is considered to be the ligand of the
Drosophila epidermal growth factor receptor (DER) homolog torpedo, which is expressed by the
follicular epithelium. The function of all three genes in an EGF-like signaling pathway appears to
direct the formation of a correctly polarized microtubule cytoskeleton, which is thought to be the
basis for the correct spatial localization of other signaling molecules essential for oocyte
polarization, asymmetric movement of the nucleus, and embryo differentiation. TGAM77, one of
the human homologs of cornichon, is differentially expressed in alloactivated T-cells (Biochim. Biophys.
Acta 1999, 1449:203-210, the disclosure of which is incorporated herein by reference in its
entirety). Since there is a well-known involvement of the microtubule cytoskeleton in spatial
polarization of signaling events in T-cell activation, it is thought that TGAM77 may function in a
protein-tyrosine kinase pathway required for the vectorial localization of signaling molecules in
T-cell activation.

The protein of SEQ ID NO:316 is found in brain tissue, and gene 95 (GSP:Y76218, PCT
publication WO/9958660, the disclosure of which is incorporated herein by reference) is expressed
in infant brain tissue, endometrial tumor tissue and frontal cortex tissue. ESTs matching this gene
are also found in lung tissue, germ cell tumors and skin melanomas. This is similar to the
expression pattern of the murine cnil gene, which is found in 6.5-day whole embryos, 11.5-day limb
bud, 13.5-day whole embryo, adult lung and brain (Dev. Genes Evol., 1999, 209:120-125, the
disclosure of which is incorporated herein by reference in its entirety).

Polynucleotides encoding the protein of SEQ ID NO:316 or fragments thereof and
polypeptides comprising the protein of SEQ ID NO:316 or fragments thereof are useful as reagents
for differential identification of the tissue(s) or cell type(s) present in a biological sample and for
diagnosis of diseases and conditions which include, but are not limited to, endometrial tumor, and
neural and developmental diseases and/or disorders. Similarly, the protein of SEQ ID NO:316 or
fragments thereof and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the neural and reproductive organs, expression
of this gene at significantly higher or lower levels may be routinely detected in certain tissues or cell
types (e.g., neural, reproductive, cancerous and wounded tissues) or bodily fluids (e.g., lymph,
serum, plasma, urine, amniotic fluid, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in infant brain tissue and adult brain tissue, as well as the homology
to cornichon proteins, indicates that polynucleotides encoding the protein of SEQ ID NO:316 or
fragments thereof and polypeptides comprising the protein of SEQ ID NO:316 or fragments thereof
are useful for detecting and/or treating neural and developmental disorders. The tissue distribution
indicates that these polynucleotides and polypeptides are useful for the detection/treatment of
neurodegenerative disease states and behavioural disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, Tourette Syndrome, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder, panic disorder, learning disabilities, ALS, psychoses, autism, and
altered behaviors, including disorders in feeding, sleep patterns, balance, and perception. In
addition, the gene or gene product may also play a role in treatment and/or detection of
developmental disorders associated with the developing embryo, or sexually-linked disorders.

Elevated expression of the protein of SEQ ID NO:316 within the brain suggests that it may
be involved in neuronal survival, synapse formation, conductance, neural differentiation, etc. Such
involvement may impact many processes, such as learning and cognition. Alternatively, the tissue
distribution in endometrial tumor tissue, germ cell tumors and skin melanomas indicates that the
translation product of this gene is useful for the detection and/or treatment of endometrial tumors
and/or reproductive disorders, as well as tumors of other tissues where expression of this gene has
been observed. Furthermore, the protein of SEQ ID NO:316 or fragments thereof may also be used
to determine biological activity, to raise antibodies, as a tissue marker, to isolate cognate ligands or
receptors, and to identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. The protein of SEQ ID NO:316 or fragments thereof, as well as antibodies directed
against the protein, may be used as tumor markers and/or immunotherapy targets for the above listed
tissues.

The gene encoding the protein of SEQ ID NO:316 is thought to reside on chromosome 11.
Accordingly, polynucleotides encoding the protein of SEQ ID NO:316 or fragments thereof are
useful as a marker in linkage analysis for chromosome 11.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:316,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be an abnormality in
development, a signaling pathway, microtubule construction, neuronal survival, synapse formation,
conductance, neural differentiation, or it may be cancer or an abnormality in any of the functions
listed above. In such embodiments, the protein of SEQ ID NO:316, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of
the protein of SEQ ID NO:316. The protein of SEQ ID NO:316 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:316 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:316 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:316 or a cell or
preparation containing the protein of SEQ ID NO:316 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical 
compound or a polypeptide or peptide. 

Alternatively, the activity of the protein of SEQ ID NO:316 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:316 may be identified by contacting the protein of
SEQ ID NO:316 or a cell or preparation containing the protein of SEQ ID NO:316 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, or to distinguish between two or more possible sources of a sample on the basis of
the level of the protein of SEQ ID NO:316 in the sample. For example, the protein of SEQ ID
NO:316 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples or differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from brain or tissues other than brain to determine whether the test sample is from brain.
Alternatively, the level of the protein of SEQ ID NO:316 in a test sample may be measured by
determining the level of RNA encoding the protein of SEQ ID NO:316 in the test sample. RNA
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization,
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
brain or tissues other than brain to determine whether the test sample is from brain.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:316,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:316 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:316 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody. 
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In another embodiment of the present invention, the protein of SEQ ID NO:316 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of the 
protein of SEQ ID NO:316. In such techniques, the level of the protein of SEQ ID NO:316 in an ill 
individual is measured using techniques such as those described herein. The level of the protein of 
SEQ ID NO:316 in the ill individual is compared to the level in normal individuals to determine
whether the individual has a level of the protein of SEQ ID NO:316 which is associated with 
disease. 

Protein of SEQ ID NO:255 (106-037-1-0-E9-CS.cor)

The protein of SEQ ID NO:255, encoded by the cDNA of SEQ ID NO:14, is strongly
expressed in the liver and testis and shows extensive homology to human lactate dehydrogenase-A
protein (LDH-A or M chain) (Chung F.Z. et al., Biochem. J. 231:537-541 (1985); SwissProt
accession number P00338). The protein of SEQ ID NO:255 is also homologous to lactate
dehydrogenase A from many vertebrates. The 381-amino-acid-long protein of SEQ ID NO:255
displays a Prosite motif corresponding to lactate dehydrogenase from positions 71 to 380. In
addition, the active site LGEHGDS, where H is the active site residue, is present in the protein of
the invention (positions 239 to 245). The protein of the invention also contains an additional 50
N-terminal amino acids not found in other lactate dehydrogenase A proteins. This N-terminal
extension contains a signal peptide (cleavage site at position 34 of the protein of the invention) that
may allow the export of the protein to the extracellular domain or define a particular subcellular
localization. Alternatively, the initiation start codon could be at position 26 or 50 of the protein of
SEQ ID NO:255.
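
As an illustration of how a short active-site motif such as LGEHGDS can be located and reported with the 1-based residue numbering used above, a minimal Python sketch follows. The function name and the toy sequence are hypothetical and chosen only for illustration; the toy sequence is not SEQ ID NO:255.

```python
def find_motif_positions(sequence: str, motif: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of every motif occurrence."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        # Convert the 0-based string index to 1-based residue numbering.
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return hits


# Hypothetical toy sequence: ten arbitrary residues, the motif, then a short tail.
toy_sequence = "MKTAYIAKQR" + "LGEHGDS" + "AVLW"
print(find_motif_positions(toy_sequence, "LGEHGDS"))  # [(11, 17)]
```

Applied to the full-length sequence, a search of this kind would be expected to report the single span given above (positions 239 to 245).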

Lactate dehydrogenase (LDH) is an enzyme which dehydrogenates lactic acid into
pyruvic acid in conjunction with the hydrogen acceptor NAD+, and which exists in a wide
variety of animal tissues and microorganisms as an enzyme serving to produce lactic acid
from pyruvic acid in the glycolytic pathway (Abad-Zapatero C. et al. J. Mol. Biol. 198:445-467
(1987)). It is known that in vertebrates there are three isozymes of LDH: the M form
(LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart
muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds.
In the eye lenses of birds and crocodilians, LDH-B serves as a structural protein and is known as
epsilon-crystallin (Hendriks W. et al. Proc. Natl. Acad. Sci. U.S.A. 85:7114-7118 (1988)).

LDH has been used extensively in the field of clinical test reagents for a number of 
purposes. For example, it has been used as a coupling enzyme to determine the enzymatic 
activity of various amino-transferases, such as alanine aminotransferase (ALT), which is 
ultimately detected by UV spectrometry of the produced pyruvic acid. This use of LDH
has been widely adopted as a clinical test, because amino-transferases are enzymes which
show high activity in liver, heart, kidney, etc. and show remarkable increases in serum in
association with various diseases. LDH has also been used as a coupling enzyme to help
determine the level of substrates such as urea, as the enzyme promotes the conversion of
such substances into pyruvic acid which can be detected by UV spectrometry.

Lactate dehydrogenase is also a widely used marker for heart disease and other
conditions. For example, levels of LD-1 are elevated in the presence of myocardial
infarction and in other conditions such as leukemia. Levels of lactate dehydrogenase start
to increase 24 to 48 hours after occlusion of the coronary artery, peak in 3 to 6 days, and
return to normal in 8 to 14 days. In addition, levels of LD-1 are elevated 10 to 12 hours
after the acute myocardial infarction, peak in 2 to 3 days, and return to normal in
approximately 7 to 10 days. Thus, measurement of the level of lactate dehydrogenase
allows a prolonged retrospective diagnosis of myocardial infarction. Further, while the
amount of LD-2 in the blood is usually higher than the amount of LD-1, patients with acute
myocardial infarction have more LD-1 than LD-2. This "flipped ratio" usually returns to
normal in 7 to 10 days. An elevated level of LD-1 with a flipped ratio has a sensitivity and
specificity of approximately 75% to 90% for detection of acute myocardial infarction.
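
The "flipped ratio" criterion described above reduces to a simple comparison of two isoenzyme measurements. The following Python sketch is illustrative only: the function name is hypothetical, and the inputs are assumed to be LD-1 and LD-2 values expressed in the same units (for example, percent of total LDH activity).

```python
def has_flipped_ratio(ld1: float, ld2: float) -> bool:
    """Return True when LD-1 exceeds LD-2, i.e. the LD-1/LD-2 ratio is above 1."""
    if ld1 < 0 or ld2 <= 0:
        raise ValueError("LD-1 must be non-negative and LD-2 must be positive")
    return ld1 / ld2 > 1.0


# Assumed example values, in percent of total LDH activity.
print(has_flipped_ratio(ld1=25.0, ld2=35.0))  # usual pattern, LD-2 higher: False
print(has_flipped_ratio(ld1=45.0, ld2=38.0))  # flipped pattern, LD-1 higher: True
```

In line with the text, such a flag would be read together with an elevated absolute LD-1 level and the time elapsed since the suspected infarction, not on its own.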

Elevated LDH levels have also been used as a prognostic indicator for cancers such 
as small cell lung carcinoma. Specifically, elevated levels of LDH indicate a poor 
prognosis for such diseases (Kawahara et al., Jpn J Clin Oncol. 1997 Jun;27(3):158-65).

LDH expression in cells has also been shown to be induced by interleukin-1 alpha, a 
major cytokine associated with, e.g., inflammation (Nehar et al. (1998) Biol Reprod 
Dec;59(6):1425-32).

Islet beta-cells express low levels of lactate dehydrogenase and have high glycerol
phosphate dehydrogenase activity. The effects on glucose metabolism and insulin secretion
of acute overexpression of the skeletal muscle isoform of lactate dehydrogenase (LDH-A)
in these cells have been studied by Ainscow EK et al. (Diabetes 2000 Jul;49(7):1149). The
results of these studies have shown that overexpression of LDH activity interferes with
normal glucose metabolism and insulin secretion in islet beta cells, and it may therefore be
directly responsible for insulin secretory defects in some forms of type 2 diabetes. These
results also reinforce the view that glucose-derived pyruvate metabolism in the
mitochondria is critical for glucose-stimulated insulin secretion in beta cells. Other data
show that overexpression of lactate dehydrogenase A attenuates glucose-induced insulin
secretion in stable MIN-6 beta-cell lines, which normally express low levels of L-lactate
dehydrogenase (Zhao C, Rutter GA, FEBS Lett. 1998 Jul 3;430(3):213-6). Low LDH
activity thus appears to be important in beta-cell glucose sensing.

Analysis of the LDH isoenzyme pattern in cerebrospinal fluid (CSF) has also been shown to be
helpful in the evaluation of CNS involvement in patients with hematologic malignancies (Lossos IS,
et al. Cancer. 2000 Apr 1;88(7):1599-604).

It is believed that the protein of SEQ ID NO:255 is a lactate dehydrogenase protein, most 
likely of the LDH-A or M subtype. The activity of the present protein can be assessed using any 
standard method for detecting lactate dehydrogenase enzyme activity, including those involving the 
UV detection of pyruvate, a product of LDH-catalyzed enzymatic reactions. 
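
As a worked example of a UV-based LDH activity calculation, the sketch below uses the Beer-Lambert law. It assumes a commonly used variant of the assay in which the reaction is followed through the change in NADH absorbance at 340 nm, not through direct detection of pyruvate, and it uses the standard NADH molar absorptivity of 6220 M^-1 cm^-1. The function and parameter names are hypothetical.

```python
NADH_EXTINCTION_M1_CM1 = 6220.0  # molar absorptivity of NADH at 340 nm


def ldh_activity_units_per_ml(
    delta_a340_per_min: float,
    total_volume_ml: float,
    sample_volume_ml: float,
    path_length_cm: float = 1.0,
) -> float:
    """Convert a rate of absorbance change at 340 nm into activity in U/mL of sample.

    One unit (U) is taken as 1 micromole of NADH converted per minute.
    """
    if total_volume_ml <= 0 or sample_volume_ml <= 0 or path_length_cm <= 0:
        raise ValueError("volumes and path length must be positive")
    # Beer-Lambert: concentration change (mol/L per min) = dA / (epsilon * path length).
    rate_molar_per_min = abs(delta_a340_per_min) / (NADH_EXTINCTION_M1_CM1 * path_length_cm)
    # Scale by the reaction volume (in litres) and convert mol to micromol.
    micromol_per_min = rate_molar_per_min * (total_volume_ml / 1000.0) * 1e6
    return micromol_per_min / sample_volume_ml


# Assumed example: dA340 = 0.0622 per minute, 1.0 mL reaction, 10 microlitres of sample.
activity = ldh_activity_units_per_ml(0.0622, total_volume_ml=1.0, sample_volume_ml=0.01)
print(round(activity, 3))  # 1.0 U/mL
```

The absolute value of the absorbance change is used so that the same calculation applies whether NADH is consumed (pyruvate to lactate) or produced (lactate to pyruvate).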

In one embodiment, the polypeptides and polynucleotides of the invention are used to detect
testis and liver tissue, as well as cells derived from these tissues. For example, nucleic acids and
proteins of the invention can be labeled isotopically or chemically, using methods known to those
skilled in the art, and used as probes in northern blots, far-western blots and in situ hybridization
experiments. An ability to detect specific cell types is useful, e.g. for the determination of the
history of tumor cells, as well as for the identification of cells and tissues for histological studies.

In another embodiment, the present protein can be used in any of a variety of clinical assays 
involving LDH enzymes. For example, the protein can be used as a coupling enzyme to determine 
the enzymatic activity of various amino-transferases, such as alanine aminotransferase (ALT), as 
detected by UV spectrometry of the produced pyruvic acid. Such assays have significant clinical
utility, as amino-transferases are enzymes which show high activity in liver, heart, kidney, etc. and
show remarkable increases in serum in association with various diseases. The protein of the 
invention can also be used as a coupling enzyme to help determine the level of substrates such as 
urea, as the enzyme promotes the conversion of such substances into pyruvic acid which can be 
detected by UV spectrometry. 

In another embodiment, the present protein can be used to identify ingredients for cosmetic
formulations. Specifically, inhibitors of lactate dehydrogenase can be included in cosmetic
compositions to stimulate keratinocyte proliferation and collagen synthesis in cutaneous tissues.
The inhibitors can be combined with other active ingredients such as pyruvic acid, acetic acid,
acetoacetic acid, beta-hydroxybutyric acid, Krebs cycle pathway metabolites, aliphatic saturated or
unsaturated fatty acids containing from 8 to 26 carbon atoms, omega-hydroxy acids containing from
22 to 34 carbon atoms, glutamic acid, glutamine, valine, alanine, leucine, and mixtures thereof (see,
e.g., US Patent 5,853,742, the disclosure of which is hereby incorporated by reference in its
entirety).

In another embodiment, the present invention provides methods for treating or preventing
cancer, e.g., by inhibiting lactate dehydrogenase activity in cells, preferably specifically the cancer
cells, of a patient. The expression or activity of lactate dehydrogenase can be inhibited using any of
a large number of agents, including, but not limited to, antibodies, antisense molecules, ribozymes,
and heterologous molecules that inhibit the expression or activity of the lactate dehydrogenase in
the cancer cells of the patient. In one embodiment, lactate dehydrogenase that has been obtained
from a primate, or anti-lactate dehydrogenase antibodies obtained from a mammal as a result of the
parenteral administration of primate lactate dehydrogenase to said mammal, is parenterally
administered to human cancer patients. Antibodies derived from the protein of the invention or part
thereof can also be used to inhibit cancer cell development as described in US Patent No. 4,620,972.

Analysis of the LDH isoenzyme pattern in cerebrospinal fluid (CSF) has been shown to be helpful
in the evaluation of CNS involvement in patients with hematologic malignancies (Lossos IS, et al.
Cancer. 2000 Apr 1;88(7):1599-604). Thus, in another embodiment, the protein of SEQ ID NO:255
can be used to develop assays to monitor the LDH isoenzyme activity in CSF, thereby improving the
sensitivity of CSF cytology. This assay may be derived, e.g., from the methods described by Short
S. et al. (J Biol Chem. 2000 Apr 28;275(17):12963-9).

In another embodiment, the protein of SEQ ID NO:255 is used to detect and/or treat insulin
secretory defects in some forms of type 2 diabetes. For example, various evidence indicates that
LDH overexpression may be involved in certain types of diabetes. Therefore, the detection of an
elevated level of LDH in a patient, e.g. in pancreatic islet cells of a patient, can be used as an
indication that the patient has diabetes, or is at risk of developing diabetes. Similarly, methods of
inhibiting the expression or activity of LDH in those cells, e.g. using antibodies, antisense
sequences, or heterologous compounds that inhibit the expression or activity of LDH, can be used to
treat or prevent diabetes.

In another embodiment, the protein of the invention can be used to eliminate endogenous 
pyruvic acid in cells in vitro or in vivo. 

In another embodiment, the expression of the present protein is used as a marker for 
interleukin 1, e.g. IL-1 alpha, activity in cells or in a patient. Specifically, as it has been shown that
LDH expression is induced by IL-1 alpha, the expression, or elevated expression, of the
present protein can be used as a marker for the action of IL-1 on the cell. As IL-1 has been 
implicated in a number of physiological processes, including inflammation and more specifically in 
deleterious processes such as arthritis and autoimmune disorders, the present protein can serve as a 
marker for the presence of such disorders, or for a predisposition for the disorders. 

In another embodiment, the present protein is used to detect heart disease and other
diseases in patients. For example, levels of LDH are known to rise following myocardial
infarction and other heart ailments. Accordingly, the detection of an elevated level of the
protein of the invention, alone or in view of the levels of other proteins such as other LDH
isozymes, can be used as an indicator of a heart attack or other diseases, including
leukemia. The levels of LDH can be assessed in any tissue or biological sample, including,
but not limited to, serum, and can be detected using any standard method, including, but
not limited to, immunoassays and assays for LDH enzyme activity.

In another embodiment, the present protein is used to determine a prognosis for any
of a number of diseases, including cancers such as small cell lung carcinoma. For example,
the level of the present protein is detected in the serum of a patient suffering from cancer,
wherein the detection of an elevated level of expression or activity of the protein indicates a
worse prognosis for the patient compared to the prognosis in a patient with a normal level
of the protein activity or expression.

Proteins of SEQ ID NOs:243, 253 (internal designation numbers 105-016-1-0-D3-CS and
105-095-2-0-G11-CS)

The 331-amino-acid-long protein of SEQ ID NO:243, encoded by the cDNA of SEQ ID
NO:2, is found in prostate and in fetal brain and is homologous to a secreted human protein (Genseq
accession number Y59685). In addition, this protein is highly homologous to the putative
glycerophosphodiester phosphodiesterase (GP-PDE) MIR16 (Membrane Interacting protein of
RGS16) protein (SPTREMBLNEW SPTREMBL SWISSPROT accession number AAF65234)
encoded by the cDNA of GENPEPT GENPEPTNEW accession number AF212862; in fact, the
protein of the invention is a likely variant of the MIR16 protein. Furthermore, a BLAST search
with the amino acid sequence of SEQ ID NO:243 indicates that the protein of the invention is
homologous to GP-PDEs of E. coli (SWISSPROT accession numbers P09394 and P10908) and
Haemophilus influenzae (SWISSPROT accession number Q06282). The protein of SEQ ID
NO:243 displays 2 candidate membrane-spanning segments, from amino acids 7 to 27 and 258 to
278, and a putative signal peptide from amino acids 19 to 24. Finally, the protein of the invention
has two putative N-glycosylation sites: asparagine residues at positions 168 and 198 (Zheng et al.,
Proc. Natl. Acad. Sci. 97:3999-4004 (2000)).
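
Putative N-glycosylation sites such as those noted above are conventionally predicted by scanning for the consensus sequon N-X-S/T, where X is any residue except proline. The Python sketch below illustrates such a scan; the function name and the toy sequence are hypothetical, and the toy sequence is not SEQ ID NO:243.

```python
import re

# Consensus sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps the match to the Asn itself, so overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sites(sequence: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T sequons (X not Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence: N3 (N-G-S) and N12 (N-L-T) match; N8 (N-P-T) does not.
print(find_n_glycosylation_sites("MANGSAANPTQNLT"))  # [3, 12]
```

A sequon match only indicates a candidate site; whether a given asparagine is actually glycosylated depends on the cellular context.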

The cDNA of SEQ ID NO:2 differs from the cDNA of GENPEPT GENPEPTNEW 
accession number AF212862 by its extended 5' and 3' termini, and from the cDNA of SEQ ID
NO:12 by polymorphisms and alternate splicings.

The MIR16 (Membrane Interacting protein of RGS16) protein, which is homologous to the
protein of the invention, was identified in a yeast two-hybrid screen of a pituitary cell cDNA library
using the RGS16 (Regulator of G protein Signaling) protein as bait (Zheng et al., Proc. Natl. Acad.
Sci. 97:3999-4004 (2000)). and Sasaki, J. Bacteriol. 175:4569-4571 (1993); Zheng et al., ibid.).
Remarkably, the GP-PDE from Haemophilus influenzae (also called protein D), which is 67%
identical to the periplasmic GP-PDE of E. coli, presents affinity for human immunoglobulin D
(Janson et al., Infect. Immun. 62:4848-4854 (1994)).

From sequence alignments, it can be seen that the N-terminal region of MIR16 (amino
acids 70-150), immediately after the putative signal peptide, is highly conserved (40-61%
similarity), suggesting that it may contain residues critical for catalytic activity, i.e., the catalytic
site. GP-PDEs hydrolyze deacylated phospholipids (GPs), such as glycerophosphocholine (GPC)
and glycerophosphoethanolamine, to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols
(Zheng et al., ibid.). The putative enzymatic activity of MIR16 and its interaction with RGS16
suggest that it may play important roles in lipid metabolism and in G protein signaling. As shown
in northern blot experiments, the MIR16 mRNA is highly transcribed in heart, liver, kidney, testis
and brain. The observed expression of MIR16 in the brain is consistent with the above-described
expression of the protein of the invention in the fetal brain.

It is believed that the proteins of SEQ ID NOs:243 and 253 or part thereof are members of
the glycerophosphodiester phosphodiesterase protein family, interact with the RGS16 protein and,
as such, play important roles in both lipid metabolism and in G protein signaling. Preferred
polypeptides of the invention are polypeptides comprising the amino acids of SEQ ID NO:243 from
positions 7 to 27, 19 to 24 and 258 to 278. Other preferred polypeptides of the invention are
fragments of SEQ ID NO:243 or 253 having any of the biological activities described herein. 
Additional preferred polypeptides are those that comprise asparagine residues at positions 168 
and/or 198. 

The invention first relates to methods and compositions using cDNAs of SEQ ID NO:2 or
12 or part thereof, and proteins of the invention SEQ ID NO:243 or 253 or part thereof to identify
specific cell types, preferably from prostate or fetal brain. For example, nucleic acids and proteins
of the invention are labeled isotopically or chemically following methods known to those skilled in
the art, and further used as probes in northern blots, far-western blots and in situ hybridization
detection experiments. An ability to detect specific cell types is useful, e.g. for the determination of
the history of tumor cells, as well as for the identification of cells and tissues for histological 
studies. 

Any of a number of in vitro assays can be used to detect SEQ ID NO:243 or 253 protein
activity, for example for in vitro screening of modulators of protein activity. Preferably cDNA
encoding the protein of the invention is cloned in a prokaryotic expression vector, according to
methods known to those skilled in the art. Briefly, the GP-PDE activity of the recombinant protein
is analyzed by a coupled spectrophotometric assay as described by Larson and collaborators and
adapted by Cameron and collaborators (Larson et al., J. Biol. Chem. 258:5426-5432 (1983);
Cameron et al., Infect. Immun. 66:5763-5770 (1998)). Such enzymatic activity may be measured
in vitro in the presence of modulating drugs.

Another embodiment of the present invention relates to methods of using the protein of the
invention or part thereof to purify or specifically bind to human immunoglobulin D. Several
immunoglobulin (Ig) binding bacterial cell wall proteins have been isolated and/or cloned during the
last two decades. The best characterized of these are protein A of Staphylococcus aureus (which
binds to human IgG subclasses 1, 2 and 4, IgG of several mammalian species, and in some
instances human Ig of classes A, M, E), and protein G of group G beta-hemolytic streptococci
(which binds to all human IgG subclasses and which also displays a wider binding spectrum for
animal IgG than protein A). IgD binds to neither protein A nor protein G. Consequently, it is of
great interest to identify new proteins capable of binding IgD, thereby allowing its separation and
purification. In addition, IgD binding proteins can also be used in immunoprecipitation procedures
with IgD, as are routinely performed with proteins A and G in the case of IgG. The binding and
purification of IgD using the protein of the invention can be accomplished in any of a number of
ways, for example by generating a fusion protein or polypeptide in which the protein of the
invention or part thereof is combined with another protein by the use of a recombinant DNA
molecule. The resulting fusion product including the protein of the invention or part thereof is then
covalently, or by any other means, bound to a protein, carbohydrate or matrix (such as gold,
"Sephadex" particles, polymeric surfaces). Such a complex is very useful for IgD immobilization
and consecutive batch immunoprecipitations. Similar assays for binding of protein D (GP-PDE)
of Haemophilus influenzae and IgD are described in US Patent No. 6,025,484.

Another embodiment of the invention relates to compositions and methods using the protein
of the invention, or part thereof, as GP-PDE enzymes to hydrolyze deacylated phospholipids (GPs),
such as glycerophosphocholine (GPC) and glycerophosphoethanolamine, to sn-glycerol-3-phosphate
(G3P) and the corresponding alcohols. First, this enzymatic activity, which belongs to
the class of specific phospholipase D, makes the protein of the invention very useful to study
biological membranes and their phospholipidic components. Moreover, as glycerophospholipids
are major components of the lipidic bilayer, elimination of their hydrophilic moiety using the
GP-PDE activity of the protein of the invention would likely modify the structure and consequently the
permeability of eukaryotic cell membranes. Such modifications could improve the transfection
efficiency of eukaryotic cells, in vitro or in vivo. Typically, in such embodiments the purified
protein of SEQ ID NOs:243 or 253 is administered to cells; purified proteins of the invention can
be obtained in any of a number of ways, for example by inserting the cDNA encoding the proteins into
a prokaryotic expression vector using any technique known to those skilled in the art. The
recombinant protein produced and purified in the prokaryotic system is then added to an in vitro
culture of eukaryotic cells before or during transfection. The recombinant protein of the invention
can also be used to increase the efficiency of cell transfection in vivo, most notably in the case of
gene therapy. For example, tumoral masses are very often resistant to transfection, and the protein
of the invention would likely provide an effective way to facilitate the introduction of cytotoxic
genes (such as pro-apoptotic genes) or antitumoral drugs in solid tumors.

Still another embodiment of the protein of the invention relates to methods and
compositions to diagnose, treat, and prevent disorders associated with excess glutamate signaling in
the brain. As described above, the MIR16 protein interacts physically with the RGS16 protein
(Regulator of G protein Signaling 16). Receptors of many hormones use heterotrimeric G proteins
for signal transduction after ligand binding (for a review, see Neer, Cell 80:249-257 (1995)).
Among these receptors are metabotropic glutamate receptors (mGluRs). These receptors, which are
expressed in the brain, like the protein of the invention, are a novel family of cloned
G-protein-coupled receptors (Schoepp and Conn, Trends Pharmacol. Sci. 14:13-20 (1993)). Endogenous
glutamate, by activating the mGluR1 receptor (and also NMDA and AMPA receptors), may
contribute to the brain damage occurring acutely after epilepsy, cerebral ischemia or traumatic brain
injury. It may also contribute to chronic neurodegeneration in such disorders as amyotrophic lateral
sclerosis and Huntington's chorea (Meldrum, J. Nutr. 130(4S Suppl):1007S-1015S (2000)).

The invention thus relates to methods and compositions using cDNAs of SEQ ID NO:2 or 
12 or part thereof, and proteins of SEQ ID NO:243 or 253 or part thereof, to diagnose, treat, or 
prevent disorders associated with excess glutamate signaling in the brain. Specifically, the level of
activity or expression of the proteins can be correlated with the level of glutamate signaling, or with
the glutamate-signaling associated brain damage involved in epilepsy, cerebral ischemia, traumatic 
brain damage, ALS, or Huntington's chorea, or with any other G-protein associated physiological 
process or disease or condition. For situations where the level of the expression or activity of the 
protein is positively correlated with such signaling or with the presence of a disease or condition,
the signaling, disease or condition can be detected using any of a number of tools for detecting
protein expression or activity, including northern blots, far-western blots and in situ hybridization 
experiments, where an elevated level of the protein, protein activity, or nucleic acid of the invention 
indicates the presence of the disease, condition, or signaling process. Further, such diseases or 
conditions can be treated or prevented, or such signaling pathways can be inhibited, using
compounds that inhibit the expression or activity of the protein, such as antibodies, antisense
molecules, ribozymes, dominant negative forms of the protein, or any heterologous molecule that
inhibits protein activity or expression. Alternatively, where the expression or activity of the protein 
of the invention is negatively associated with the signaling pathway, disease or condition, a 
detection of a decreased level of expression or activity of the protein can be used to indicate the
presence of the disease, condition, or pathway. Further, in such cases, the disease or condition can
be treated or prevented, or the pathway be inhibited, using any compound that increases the activity 
or level of the protein, such as nucleic acids encoding the protein, the protein itself, or heterologous 
compounds that cause an increase in the level of protein expression or activity. 

Protein of SEQ ID NO:386 (internal designation 105-037-4-0-H12-CS)

The protein of SEQ ID NO:386, encoded by the cDNA of SEQ ID NO:145, is strongly
expressed in the fetal brain and uterus. The 207-amino-acid-long protein of SEQ ID NO:386
displays pfam SPRY domains from positions 85 to 205.

SPRY domains have been found in a number of proteins involved in multiple cellular and
developmental processes. For example, the Midline-1/FXY family of proteins has been shown to
associate with microtubules, and has been implicated in human diseases, such as Opitz Syndrome, a
congenital disorder characterized by multiple developmental abnormalities (see, e.g., Cainarca, et
al., (1999) Hum Mol Genet 8(8):1387-96). In addition, the cytoplasmic Marenostrin/Pyrin protein
has been demonstrated to be the cause of Familial Mediterranean fever, an autosomal recessive
disorder characterized by fever and serositis (Nat Genet 1997 Sep;17(1):25-31). Other SPRY
proteins include SplA, a serine protease from Staphylococcus aureus, and butyrophilin, a major
milk protein. Another family of proteins known to contain the SPRY domain are the ryanodine
receptors (RyRs).

Ryanodine receptors play an important role in Ca2+ signaling in muscle and non-muscle
cells by releasing Ca2+ from intracellular stores. For example, these receptors are centrally
important in excitation-contraction (e-c) coupling, which occurs at specialized regions where the
sarcoplasmic reticulum (SR), containing the ryanodine receptors, and the plasma
membrane/transverse-tubule system form junctions. RyRs are also thought to play some role in
maintaining the structural integrity of the SR T-tubule junctions. RyR is apparently unable to carry
out the requisite functions associated with e-c coupling by itself, however, because it forms
interactions with other macromolecules at the triad junction. For example, two small proteins,
calmodulin and FKBP12, are believed to modulate RyR at the triad junction.

It is believed that mammalian tissues express three different RyR isoforms, comprising four 
560-kDa (RyR polypeptide) and four 12-kDa (FK506 binding protein) subunits. It is believed that 
these large protein complexes conduct monovalent and divalent cations and are capable of multiple 
interactions with other molecules. The subunits of the protein complexes include small diffusible
endogenous effector molecules including Ca2+, Mg2+, adenine nucleotides, sulfhydryl modifying
reagents (glutathione, NO, and NO adducts) and lipid intermediates, and proteins such as protein 
kinases and phosphatases, calmodulin, immunophilins (FK506 binding proteins), and in skeletal 
muscle the dihydropyridine receptor. The RyR from skeletal muscle is the major calcium release 
channel for that tissue, and the most intensively studied of the three genetic isoforms detected thus
far in mammalian species. The other two RyR isoforms are often referred to as the 'heart' and 'brain'
forms, but the actual cell and tissue distribution of the isoforms is complex. 

Because of their multiple ligand interactions, ryanodine receptors constitute an important,
potentially rich pharmacological target for controlling cellular functions. Ca2+ release channel
activity is modulated by many endogenous effectors, including Ca2+, ATP, Mg2+, and calmodulin.
In addition, many exogenous effectors, including caffeine, local anesthetics, and polyamines, also
modify channel activity. For example, tetracaine, procaine, benzocaine, and lidocaine inhibit Ca2+
release from the SR. They appear to interact with a specific site(s) located on the RyR, affecting
both ryanodine-binding and single channel activities (Shoshan-Barmatz et al. 1993; J. Membr. Biol.;
133; 171-181).

The importance of intracellular calcium as a second messenger in cellular signal
transduction processes is well established. Alterations in intracellular Ca2+ homeostasis have
profound effects on many cell functions, including secretion, contraction-relaxation, motility,
metabolism, protein synthesis, modification and folding, gene expression, cell-cycle progression
and apoptosis. A major source of cytoplasmic calcium is from intracellular storehouses located in
the endoplasmic reticulum, or in muscle, within the sarcoplasmic reticulum (SR).

Given that cellular Ca2+ handling is an important factor in the control of neuronal
metabolism and electrical activity, abnormalities of intracellular Ca2+ channels might be expected
to contribute to some forms of epilepsy or to anoxic brain damage following an episode of cerebral
ischemia. Cell loss is said to be a characteristic feature of degenerative brain disorders, including
Alzheimer's disease. It is well established that neuronal cell death may be secondary to an
abnormal elevation of cytoplasmic Ca2+, particularly that associated with activation of excitatory
glutamate receptors (e.g., in epilepsy). This strongly suggests that the release of stored Ca2+
contributes to nerve cell damage and cell death in various circumstances.

It is believed that the protein of SEQ ID NO:386 is functionally related to other
SPRY-containing proteins, such as the ryanodine receptors, Marenostrin/Pyrin, SplA, Midline-1/FXY, and
butyrophilin. Accordingly, it is believed that the present protein is associated with the release
of Ca2+ from intracellular Ca2+-storing organelles, like the endoplasmic reticulum and, in muscle,
the sarcoplasmic reticulum (SR), as well as being involved in microtubule binding. Preferred
polypeptides of the invention are any fragments of SEQ ID NO:386 having any of the biological
activities described herein.

In one embodiment, the present protein and nucleic acids can be used to specifically detect
cells of the fetal brain and uterus, as the protein is overexpressed in these tissues. For example, the
protein of the invention or part thereof may be used to synthesize specific antibodies using any
technique known to those skilled in the art. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, such as in forensic samples, differentiated tumor tissue that has
metastasized to foreign bodily sites, etc., or to differentiate different tissue types in a tissue
cross-section using immunochemistry. The protein can also be used to specifically label microtubules in
cells. 

In another embodiment, the protein of the invention or part thereof may be used in
regulating intracellular Ca2+ levels. As alterations in intracellular Ca2+ homeostasis have profound
effects on many cell functions, including secretion, contraction-relaxation, motility, metabolism,
protein synthesis, modification and folding, gene expression, cell-cycle progression and apoptosis,
the ability to modulate intracellular Ca2+ levels provides a tool to alter any of these cellular
functions, in vitro or in vivo. Such an ability has wide utility for a large number of applications, for
example to manipulate the behavior (e.g. growth rate, secretion, survival, etc.) of cells grown in
vitro, as well as to treat, prevent, or diagnose any of a number of diseases associated with altered
Ca2+ signaling in vivo. The activity or expression of the protein of the invention can be modulated
in any of a large number of ways, for example by administering to cells or to a patient the protein
itself, a polynucleotide encoding the protein, antibodies, antisense sequences, dominant negative
forms of the protein, compounds that alter the expression or activity of the protein, etc. The effect
of any such agent on calcium flux in cells can be detected using standard methods, including by
studying Ca2+ release through endoplasmic reticulum (ER) and sarcoplasmic
reticulum (SR) channels using tracers, light scattering and fluorescence quenching, and by channel
reconstitution in planar bilayers. In addition, targeted recombinant photoproteins can provide direct
measurements of organellar Ca2+ (Montero et al., 1995, EMBO J. 14:5467-5475).

The invention further relates to methods and compositions using the protein of the invention
or part thereof to diagnose, prevent and/or treat several disorders in which the activity or
recognition of ryanodine receptors is impaired or excessive. These disorders include, but are not
limited to, neurodegenerative diseases, cardiovascular disorders, severe myasthenia, malignant
hyperthermia, epilepsy, and central core disease. For example, in patients with severe myasthenia,
the level of anti-RyR antibodies has been directly related to the severity of the disease (Skeie et al.,
1996, Eur. J. Neurol. 3:136-140). There is also some evidence to suggest that RyR abnormalities
are a primary cause of many types of cardiac disease. In addition, the protein of the invention can
be used to diagnose other diseases associated with SPRY-protein dysfunction, such as Familial
Mediterranean fever and Opitz syndrome. Finally, as SPRY-containing proteins have been
implicated in embryonic development (e.g. the Midline 1 protein), the protein and nucleic acids of
the invention can be used to detect developmental disorders, as the detection of a mutation in the
gene encoding SEQ ID NO:386, or the detection of abnormal gene expression in a fetus, can be used
to indicate the presence of a developmental abnormality. For example, as the protein of SEQ ID
NO:386 is strongly expressed in the fetal brain, it is likely that the protein plays a role in the normal
development of the brain in utero.

The present invention also relates to diagnostic assays for detecting altered levels of the
protein of SEQ ID NO:386 in various tissues, as over-expression of the protein compared to normal
control tissue samples can indicate the presence of certain disease conditions such as
neurodegenerative disorders, cardiovascular disorders, severe myasthenia, malignant hyperthermia,
epilepsy, and central core disease. Assays used to detect levels of the polypeptide of the present
invention in a sample derived from a host are well-known to those of skill in the art and include
radioimmunoassays, competitive-binding assays, Western blot analysis and ELISA assays.

Proteins of SEQ ID NOs:283 and 286 (internal designations 174-38-1-0B6-CS LA and
174-41-1-0-A6-CS LA)

The protein of SEQ ID NO:283, encoded by the cDNA of SEQ ID NO:42, is overexpressed
in salivary glands and to a lesser extent in bone marrow, and shows homology over the C-terminal
length to the immunoglobulin (Ig) protein superfamily, which is conserved among eukaryotes
(including rabbit, rodents and human). In particular, the 468-amino-acid-long protein of the
invention, which is similar in size to the constant chain of Ig related proteins, displays two pfam
conserved immunoglobulin domains, from position 205 to 285 and from position 318 to 384, which
are known to be involved in the basic structure of the light and heavy constant chains of
immunoglobulins. It is known (Orr H.T., Nature 282:266-270 (1979)) that the Ig constant chain
domains and a single extracellular domain in each type of MHC chain are closely related, sharing
over one hundred amino acids of homology. All members of the Ig related superfamily, including
the MHC class I alpha chain and beta-2-microglobulin, as well as the MHC class II alpha and beta
chains, display the prosite conserved characteristic pattern around the C-terminal cysteine
([FY]-x-C-x-[VA]-x-H). This cysteine is involved in the disulfide bond between the light and heavy chains,
and is also found in the protein of the invention (position 380 to 386). The protein of the invention
also exhibits an emotif Ig and Major Histocompatibility Complex protein signature from positions
319 to 336. In addition, the protein of the invention displays homology with tapasin (GenBank
No. AF009510), a chaperone-like protein closely associated with TAP-binding proteins, which is
well conserved among eukaryotes (chicken, rodents and human). Tapasin has been shown to
increase the efficiency of antigen processing and presentation by mediating the association of MHC
complex proteins with TAP proteins to the endoplasmic reticulum and to the cell surface during
immune response (for review see Abele, R. and Tampe, R., Biochim. Biophys. Acta, 1999). In
addition, the protein of the invention displays two transmembrane domains from positions 199 to
219 and from positions 406 to 426, a hydrophobic profile similar in amino acid position to the
hydrophobic stretch of amino acids of human and mouse tapasin (Suling L., J. Biol. Chem.,
274:8649-8654, 1999), and a secreted signal peptide from position 9 to 23. Both signatures are
largely present in Ig related proteins such as secreted antibodies or antigen presenting proteins. The
invention also encompasses a variant (SEQ ID NO:286) of SEQ ID NO:283, encoded by the cDNA
of SEQ ID NO:45. The protein of SEQ ID NO:286 is a 442-amino-acid-long protein whose
C-terminal end is 26 amino acids shorter than that of the protein of SEQ ID NO:283. The variant of
SEQ ID NO:286, which results from a frameshift (position 1445 in SEQ ID NO:45) in the coding
sequence that leads to a stop codon in the corresponding protein, displays characteristics identical to
those described above in terms of motifs, Ig signatures, function, and potential uses.
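
The prosite pattern [FY]-x-C-x-[VA]-x-H cited above can be checked against a sequence
programmatically. The following Python sketch is illustrative only: the helper names and the
short test sequence are hypothetical (the sequence of SEQ ID NO:283 is not reproduced here), and
only the simple pattern elements used in this passage are handled. It converts the pattern to a
regular expression and reports 1-based, inclusive match positions, the convention used for the
positions quoted above (e.g. 380 to 386, a span of seven residues).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supports literal residues, 'x' wildcards, bracketed alternatives
    such as [FY], and repeats written as x(n) or x(n,m).
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":          # 'x' means any residue
            core = "."
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)


def find_motif(sequence: str, pattern: str):
    """Return (start, end) pairs, 1-based and inclusive, for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end()) for m in regex.finditer(sequence)]


# Pattern taken from the text; the sequence below is a made-up toy example.
IG_MHC_PATTERN = "[FY]-x-C-x-[VA]-x-H"
toy_sequence = "MKTAYRCEVTHQGL"

if __name__ == "__main__":
    print(prosite_to_regex(IG_MHC_PATTERN))
    print(find_motif(toy_sequence, IG_MHC_PATTERN))
```

Applied to a real sequence, a hit spanning seven consecutive residues would correspond to a
position range of the form reported above for the protein of the invention.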

The immunoglobulin (Ig) gene superfamily comprises a large number of cell surface
glycoproteins that share sequence homology with the V and C domains of antibody heavy and light
chains. These molecules function as receptors for antigens, immunoglobulins and cytokines as well
as adhesion molecules, and play important roles in regulating the complex cell interactions that
occur within the immune system (A. F. Williams et al., Annu. Rev. Immunol. 6:381-405, 1988; T.
Hunkapiller et al., Adv. Immunol. 44:1-63, 1989; for a short review see also Prosite entry PS00290).

The introduction of an antigen into a host initiates a series of events culminating in an
immune response. In addition, self-antigens can result in immunological tolerance or activation of
an immune response against self-antigens. A major portion of the immune response is regulated by
presentation of antigen by major histocompatibility complex molecules. MHC molecules bind to
peptide fragments derived from antigens to form complexes that are recognized by T cell receptors
on the surface of T cells, giving rise to the phenomenon of MHC-restricted T cell recognition. The
ability of a host to react to a given antigen (responsiveness) is influenced by the spectrum of MHC
molecules expressed by the host. Responsiveness correlates with the ability of specific peptide
fragments to bind to particular MHC molecules.

There are two types of MHC molecules, class I and class II, each of which comprises two
chains. In class I [2], the alpha chain is composed of three extracellular domains, a transmembrane
region, and a cytoplasmic tail. The beta chain (beta-2-microglobulin) is composed of a single
extracellular domain. In class II [3], both the alpha and the beta chains are composed of two
extracellular domains, a transmembrane region and a cytoplasmic tail. MHC class I molecules are
expressed on the surface of all cells, and MHC class II molecules are expressed on the surface of
antigen presenting cells. MHC class II molecules bind to peptides derived from proteins made
outside of an antigen presenting cell. In contrast, MHC class I molecules bind to peptides derived
from proteins made inside a cell. In order to present peptide in the context of a class II molecule, an
antigen presenting cell phagocytoses an antigen into an intracellular vesicle, in which the antigen is
cleaved, bound to an MHC class II molecule, and then returned to the surface of the antigen
presenting cell.

Major histocompatibility complex (MHC) class I molecules present antigenic peptides to
CD8 T cells (Townsend, A. et al., Nature 340:443-448). The peptides are generated in the cytosol
and then translocated across the membrane of the endoplasmic reticulum by the transporter
associated with antigen processing (TAP). TAP is a trimeric complex consisting of TAP1, TAP2,
and tapasin (TAP-A). TAP1 and TAP2 are required for the peptide transport. Tapasin mediates the
interaction of MHC class I HC-beta-2-microglobulin with TAP, and this interaction is essential for
peptide loading onto MHC class I HC-beta-2-microglobulin (Suling et al., J. Biol. Chem.,
274:8649-8654). T cell receptors (TCRs) are the second antigen recognition molecules, and
recognize antigens that are bound by MHC molecules. Recognition of MHC complexed with
peptide (MHC-peptide complex) by TCR can affect the activity of the T cell bearing the TCR.
MHC-peptide complexes are therefore important in the regulation of T cell activity and, thus, in
regulating an immune response.

Human cytomegalovirus (HCMV) is a betaherpesvirus which causes clinically serious
disease in immunocompromised and immunosuppressed adults, as well as in some infants infected
in utero or perinatally (Alford, C. A., and W. J. Britt. 1990. Cytomegalovirus, p. 1981-2010. In D.
M. Knipe and B. N. Fields (ed.), Virology, 2nd ed. Raven Press, New York). In human
cytomegalovirus (HCMV)-infected cells, expression of the cellular major histocompatibility
complex (MHC) class I heavy chains is down-regulated, where down-regulation is defined as
reduction in either synthesis, stability or surface expression of MHC class I heavy chains. A similar
phenomenon has been reported for some other DNA viruses, including adenovirus, murine
cytomegalovirus, and herpes simplex virus (Anderson, M., et al., Cell 43:215-222, 1985; Burgert
and Kvist, Cell 41:987-997, 1985; Heise T. M., et al., J. Exp. Med. 187:1037-1046, 1998). In the
adenovirus and herpes simplex virus systems, the product of a viral gene which is dispensable for
replication in vitro is sufficient to cause down-regulation of MHC class I heavy chains (Anderson,
M., et al., 1985, supra). The gene(s) involved in class I heavy chain down-regulation by murine
cytomegalovirus have not yet been identified.

It is believed that the proteins of SEQ ID NOs:283 and 286 are members of the
immunoglobulin superfamily and, as such, play a role in the immune response, cellular proteolysis,
cell proliferation and differentiation, pathogen recognition, apoptosis, and other processes
associated with the Ig superfamily. In addition, the proteins of the invention are thought to be
tightly linked to the antigen processing and presentation system in the context of peptide assembly
and translocation of foreign peptides across endoplasmic reticulum and cell surface membranes as
new chaperonin-like proteins associated with MHC I and TAP proteins. The weak homology (30%)
with the TAP protein family is thought to indicate the specificity of the interactions of the proteins
of the invention with MHC proteins and/or TAP-related proteins, as described by Suling et al.,
supra.

Preferred polypeptides of the invention are polypeptides comprising the amino acids of
SEQ ID NO:283 from position 9 to 23, 199 to 219, 205 to 285, 318 to 384, 319 to 336, 380 to 386
and from 406 to 426. Other preferred polypeptides of the invention are fragments of SEQ ID
NO:283 having any of the biological activities described herein.
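
The position ranges listed above are 1-based and inclusive. As an illustration only, the
following Python sketch shows how such ranges could be mapped onto a sequence string. The region
labels follow the description of SEQ ID NO:283 given earlier; the function names and the
placeholder sequence are hypothetical, since the actual sequence appears in the sequence listing
and not in this passage.

```python
# Regions of SEQ ID NO:283 as (start, end), 1-based and inclusive,
# labelled according to the description in the text.
REGIONS = {
    "signal peptide": (9, 23),
    "transmembrane domain 1": (199, 219),
    "Ig domain 1 (pfam)": (205, 285),
    "Ig domain 2 (pfam)": (318, 384),
    "Ig/MHC signature (emotif)": (319, 336),
    "Ig/MHC pattern (prosite)": (380, 386),
    "transmembrane domain 2": (406, 426),
}


def fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(
            f"range {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


def region_lengths(regions):
    """Length in residues of each (start, end) region."""
    return {name: end - start + 1 for name, (start, end) in regions.items()}


if __name__ == "__main__":
    # Hypothetical stand-in for the 468-residue protein; not the real sequence.
    placeholder = "X" * 468
    for name, (start, end) in REGIONS.items():
        print(f"{name}: {start}-{end}, {len(fragment(placeholder, start, end))} residues")
```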

In one embodiment, the invention relates to methods and compositions for using the protein
of the invention or part thereof as a marker protein to selectively identify tissues, such as salivary
glands and bone marrow tissues, which strongly express the protein of the invention. For example,
the protein of the invention or part thereof may be used to synthesize specific antibodies using any
techniques known to those skilled in the art, including those described herein. Such tissue-specific
antibodies may then be used to identify tissues of unknown origin, for example, forensic samples or
differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate different
tissue types in a tissue cross-section using immunochemistry.

In another embodiment, the invention relates to methods for using the protein of the
invention to visualize proteins and peptides involved in the antigen recognition system within cells by
virtue of their physical interaction with the proteins of the invention. For example, the protein may
be used to detect the presence and/or the localization of MHC peptides and TAP-like proteins in a
cell. The protein of the invention, and hence any interacting proteins, can be labeled using any of a
number of methods, including by binding with specific antibodies or by creating a fusion protein
comprising the protein of the invention as well as a readily detectable moiety, such as an epitope
tag, biotin, or green fluorescent protein.

In another embodiment, polynucleotide or polypeptide sequences of the invention or part
thereof may be used for the diagnosis of a disorder associated with a loss of regulation of the
expression of the protein of the invention, preferably, but not limited to, deficiencies of the MHC
protein system. Examples of such disorders include, but are not limited to, acquired
immunodeficiency syndrome (AIDS), X-linked agammaglobulinemia of Bruton, common variable
immunodeficiency (CVI), DiGeorge's syndrome (thymic hypoplasia), thymic dysplasia, isolated
IgA deficiency, severe combined immunodeficiency disease (SCID), immunodeficiency with
thrombocytopenia and eczema (Wiskott-Aldrich syndrome), Chediak-Higashi syndrome, chronic
granulomatous diseases, hereditary angioneurotic edema, immunodeficiency associated with
Cushing's disease, Addison's disease, adult respiratory distress syndrome, allergies, ankylosing
spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia,
autoimmune thyroiditis, bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic
dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with
lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis,
glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis,
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, leukemias
such as multiple myeloma, and lymphomas such as Hodgkin's disease; a cell proliferative disorder
such as arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease
(MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis,
primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma,
melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland,
bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart,
kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, and uterus; and an infection, such as infections by viral agents
classified as adenovirus, arenavirus, bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus,
herpesvirus, flavivirus, orthomyxovirus, parvovirus, papovavirus, paramyxovirus, picornavirus,
poxvirus, reovirus, retrovirus, rhabdovirus, and togavirus; infections by bacterial agents classified as
pneumococcus, staphylococcus, streptococcus, bacillus, corynebacterium, clostridium,
meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus, legionella, bordetella,
gram-negative enterobacterium including shigella, salmonella, and campylobacter, pseudomonas,
vibrio, brucella, francisella, yersinia, bartonella, nocardia, actinomyces, mycobacterium,
spirochaetale, rickettsia, chlamydia, and mycoplasma; infections by fungal agents classified as
aspergillus, blastomyces, dermatophytes, cryptococcus, coccidioides, malassezia, histoplasma, and
other fungal agents causing various mycoses; and infections by parasites classified as plasmodium
or malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis
carinii, intestinal protozoa such as giardia, trichomonas, tissue nematodes such as trichinella,
intestinal nematodes such as ascaris, lymphatic filarial nematodes, trematodes such as schistosoma,
and cestodes such as tapeworm. To assess abnormal expression of the present protein associated
with any of these disorders, the level of the present polynucleotides or polypeptides can be detected
in a biological sample or cell using any standard method, including Southern or northern analysis,
dot blots, other membrane-based technologies, PCR technologies, dipstick, pin, ELISA assays, and
microarrays. Any of these methods may be used for the diagnosis of disorders characterized by
an alteration of expression of SEQ ID NO:283 or 286, such as the disorders mentioned above, or in
assays to monitor patients being treated with SEQ ID NO:283 or 286 or agonists, antagonists, or
inhibitors of SEQ ID NO:283 or 286. Antibodies useful for diagnostic purposes may be prepared,
e.g., in the same manner as that described in U.S. Patent No. 6,135,941. Diagnostic assays for SEQ
ID NO:283 or 286 include methods which utilize the antibody and a label to detect SEQ ID
NO:283 or 286 in human body fluids or in extracts of cells or tissues. The antibodies may be used with
or without modification, and may be labeled by covalent or non-covalent attachment of a reporter
molecule. A wide variety of reporter molecules, several of which are described above, are known in
the art and may be used.

In another embodiment, the protein of SEQ ID NO:283 or 286 or a fragment or derivative
thereof may be administered to a subject to diagnose, treat or prevent an immune disorder
associated with decreased expression or activity of the protein of the invention. Such disorders can
include, but are not limited to, acquired immunodeficiency syndrome (AIDS), X-linked
agammaglobulinemia of Bruton, common variable immunodeficiency (CVI), DiGeorge's syndrome
(thymic hypoplasia), thymic dysplasia, isolated IgA deficiency, severe combined immunodeficiency
disease (SCID), immunodeficiency with thrombocytopenia and eczema (Wiskott-Aldrich
syndrome), Chediak-Higashi syndrome, chronic granulomatous diseases, hereditary angioneurotic
edema, immunodeficiency associated with Cushing's disease, Addison's disease, adult respiratory
distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, cholecystitis, contact dermatitis,
Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic
lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis,
glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis,
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, leukemias
such as multiple myeloma, and lymphomas such as Hodgkin's disease. In addition, such disorders
associated with decreased protein expression or activity can be treated by administering to a patient
polynucleotide sequences encoding the protein of the invention, e.g. inserted in an appropriate
vector. In another example, a compound that increases either the activity or the expression of the
protein of the invention can be administered to a patient to treat or prevent any of the diseases
mentioned above.

In a further embodiment, an antagonist of the protein of the invention may be administered
to a subject to treat or prevent an immune disorder associated with increased expression or activity
of the protein of SEQ ID NO:283 or 286 including, but not limited to, auto-immune diseases or
graft rejection. In one aspect, an antibody which specifically binds the protein of the invention may
be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissues which express the proteins of the invention, such as the
salivary gland tissue or the bone marrow tissue. In addition, sense or antisense nucleotides, GSE,
ribozymes, or specific protein inhibitors such as antibodies or small compounds can be administered
to inhibit the expression of the proteins of the invention.

In another embodiment, an antagonist of the protein of SEQ ID NO:283 may be
administered to a subject to treat or prevent a cell proliferative disorder. Such disorders may
include, but are not limited to, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria,
polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma,
leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of
the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia,
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis,
prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. In one aspect, an antibody
which specifically binds the protein of the invention may be used directly as an antagonist or
indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue
which express the protein of the invention. In another example, sense or antisense nucleotides, GSE,
or ribozymes designed from nucleotides of the invention can be administered to inhibit the
expression of the protein of the invention.

Protein of SEQ ID NO: 411 (internal designation 181-10-1-0-C9-CS)

The protein of SEQ ID NO: 411, encoded by the cDNA of SEQ ID NO: 170, is highly
expressed in fetal liver. The protein of the invention is homologous to peripheral benzodiazepine
receptor/isoquinoline binding protein (PBR/IBP) of human, bovine and murine origin (GenBank
accession numbers M36035, M64520 and L17306 respectively). The 170-amino-acid protein of
SEQ ID NO: 411 is similar in size and hydropathicity to known peripheral
benzodiazepine receptors/isoquinoline binding proteins (PBR/IBP). Like the known peripheral benzodiazepine
receptors/isoquinoline binding proteins, the protein of the subject invention has about five potential
transmembrane domains at positions 3-23, 45-65, 82-102, 105-125 and 130-150. Moreover, the
protein of the invention displays a stretch of 11 amino acids (starting with V144 and ending with
R154) that corresponds to a recently identified putative cholesterol recognition/interaction amino
acid consensus pattern (-L/V-(X)(1-5)-Y-(X)(1-5)-R/K-) [See Li et al., Endocrinology 1998 Dec;
139 (12): 4991-7].
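
The cholesterol recognition/interaction consensus cited above (a leucine or valine, one to five
arbitrary residues, a tyrosine, one to five arbitrary residues, then an arginine or lysine) maps
directly onto a regular expression. The Python sketch below is an illustration under that reading
of the pattern; the function name and the toy sequence are hypothetical and are not taken from
SEQ ID NO: 411.

```python
import re

# Consensus read as: [L or V] - 1 to 5 of any residue - Y - 1 to 5 of any residue - [R or K]
CRAC_PATTERN = re.compile(r"[LV].{1,5}Y.{1,5}[RK]")


def find_crac(sequence: str):
    """Return (start, end, matched_text) for non-overlapping matches.

    Positions are 1-based and inclusive, matching the way residue
    positions are quoted in the text (e.g. 144 to 154).
    """
    return [(m.start() + 1, m.end(), m.group()) for m in CRAC_PATTERN.finditer(sequence)]


if __name__ == "__main__":
    # Made-up example containing one 11-residue stretch that starts with V and ends with R.
    toy_sequence = "GGVAGSYLMPTQRGG"
    for start, end, text in find_crac(toy_sequence):
        print(start, end, text, end - start + 1)
```

Because the two variable gaps each allow one to five residues, a match can span between five and
thirteen residues; an 11-residue stretch such as the one described for positions 144 to 154 falls
within that range.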

The peripheral benzodiazepine receptor (PBR) is an 18-kDa protein containing binding sites
for benzodiazepine and is distinct from the GABA neurotransmitter receptor [Papadopoulos, V.
(1993) Endocr. Rev. 14: 222-240]. Expression of PBR has been found in every tissue examined.
However, it is most abundant in steroidogenic cells and is found primarily on outer
mitochondrial membranes [Anholt, R. et al. (1986) J. Biol. Chem. 261:576-583]. PBR is thought to
be associated with a multimeric complex composed of the 18-kDa isoquinoline binding protein and
the 34-kDa pore-forming voltage dependent anion channel protein, preferentially located on the
outer/inner mitochondrial membrane contact sites [McEnery, M.W. et al. Proc. Natl. Acad. Sci.
USA 89:3170-3174; Garnier, M. et al. (1994) Mol. Pharmacol. 45:201-211; Papadopoulos, V. et
al. (1994) Mol. Cell. Endocr. 104:R5-R9]. Drug ligands of PBR, upon binding to the receptor,
stimulate steroid synthesis in steroidogenic cells in vitro [Papadopoulos, V. et al. (1990) J. Biol.
Chem. 265: 3772-3779; Barnea, E. R. et al. (1989) Mol. Cell. Endocr. 64: 155-159; Amsterdam, A.
and Suh, B.S. (1991) Endocrinology 128: 503-510]. Likewise, in vivo studies showed that high
affinity PBR ligands increase steroid plasma levels in hypophysectomized rats [Amri, H. et al.
(1996) Endocrinology 137:5707-5718]. Further in vitro studies on isolated mitochondria provided
evidence that PBR ligands, whether drug ligands or the endogenous PBR ligand (the polypeptide
diazepam-binding inhibitor (DBI) [Papadopoulos, V. et al. (1997) Steroids 62: 21-28]), stimulate pregnenolone
formation by increasing the rate of cholesterol transfer from the outer to the inner mitochondrial
membrane [for review, see Culty, M. et al. (1999) Journal of Steroid Biochemistry and Molecular
Biology 69: 123-130].

Based on the amino acid sequence of the 18-kDa PBR, a three dimensional model was
developed [Papadopoulos, V. (1996) In: The Leydig Cell. Payne, A. H. et al. (eds) Cache River
Press, IL, pp 596-628]. This model was shown to accommodate a cholesterol molecule and
function as a channel, supporting the role of PBR in cholesterol transport. The role of PBR in
steroidogenesis was also demonstrated by observing that PBR negative cells generated by
homologous recombination failed to produce steroids [Papadopoulos, V. et al. (1997) J. Biol. Chem.
272: 32129-32135]. Further, cholesterol transport experiments in bacteria expressing the 18-kDa
PBR protein provided definitive evidence for a function as a cholesterol channel/transporter
[Papadopoulos, V. et al. (1997) supra].

In addition to its role in mediating cholesterol movement across membranes, PBR has been 
implicated in several other physiological functions, including cell growth and differentiation, 




WO 01/42451 PCT/IB00/01938 

chemotaxis, mitochondrial physiology, porphyrin and heme biosynthesis, immune response, anion
transport and GABAergic regulation of the CNS [for review, see Gavish, M. et al. (1999)
Pharmacological Reviews 51: 629-650; Beurdeley-Thomas, A. et al. (2000) Journal of
Neuro-Oncology 46: 45-56]. A recent report also indicates that PBR agonists are potent anti-apoptotic
compounds, suggesting that this effect may represent a major function for this receptor
[Bono, F. et al. (1999) Biochemical and Biophysical Research Communications 265:457-461].

It appears that PBR is associated with stress and anxiety disorders. It has been suggested 
that PBRs play a role in the regulation of several stress systems such as the HPA axis, the 
sympathetic nervous system, the renin-angiotensin axis, and the neuroendocrine axis. In these
systems, acute stress typically leads to increases in PBR density, whereas chronic stress typically
leads to decreases in PBR density. Furthermore, in Generalized Anxiety Disorder (GAD), Panic 
Disorder (PD), Generalized Social Phobia (GSP), and Post-Traumatic Stress Disorders (PTSD), 
PBR density is typically decreased in platelets. 

In the brain, where PBRs are associated with glial cells, PBRs are increased in specific
brain areas in neurodegenerative disorders and also after neurotoxic and traumatic-ischemic brain
damage [for review, see Gavish, M. et al. (1999) supra]. The literature also reports a decrease in
peripheral-type benzodiazepine receptors in postmortem brains of chronic schizophrenics, suggesting that
the decreased density of PBRs in the brain may be involved in the pathophysiology of
schizophrenia. Increased levels of PBR in autopsied brain tissue from patients with portal-systemic
encephalopathy (PSE) have been reported, thus supporting the theory that activation of
PBR contributes to the pathogenesis characteristic of PSE in the
central nervous system [Kurumaji, A. et al. (1997) J. Neural Transm. 104:1361-1370; Butterworth
R. F. (2000) Neurochemistry International 36: 411-416].

In addition to its involvement in the neurological disorders discussed supra, PBR has been
implicated in the regulation of tumor cell proliferation [for review, see Gavish, M. et al. (1999)
supra; Beurdeley-Thomas, A. et al. (2000) supra; Hardwick, M. (1999) Cancer Research 59:831-842;
Venturini, I. et al. (1998) Life Sci 63:1269-80; Carmel I et al. (1999) Biochem Pharmacol 58:
273-8]. The invasiveness and metastatic ability of human breast tumor cells is proportional to the
level of PBR expressed. Further, PBR has been proposed to be used as a tool/marker for detection,
diagnosis, prognosis and treatment of cancer [WO 99/49316, hereby incorporated by reference in its
entirety].

Many ligands have been described that bind to peripheral benzodiazepine receptor with
various affinities. Some benzodiazepines, Ro 5-4864 [4-chlorodiazepam], diazepam and structurally
related compounds, are potent and selective PBR ligands. Exogenous ligands also include
2-phenylquinoline carboxamides (PK11195 series), imidazo[1,2-a]pyridine-3-acetamides (Alpidem
series) and pyridazine derivatives. Some endogenous compounds, including porphyrins and
diazepam binding inhibitor (DBI), bind to PBR with nanomolar and micromolar affinity [for
review, see Gavish, M. et al. (1999) supra; Beurdeley-Thomas, A. et al. (2000) supra].

The protein of SEQ ID NO: 411 is a novel peripheral-type benzodiazepine receptor. As
such, it serves a channel function that mediates cholesterol movement across membranes and plays a
role in steroidogenesis, cell growth and differentiation, chemotaxis, mitochondrial physiology,
protection against apoptosis, porphyrin and heme biosynthesis, immune response, anion transport
and GABAergic regulation of the CNS.

In one embodiment, a preferred polypeptide of the invention comprises the amino acids of
SEQ ID NO: 411 from position 144 to 154. In another embodiment, the subject invention provides
a polypeptide comprising the sequence of SEQ ID NO: 411. Other preferred polypeptides of the
invention include biologically active fragments of SEQ ID NO: 411. Biologically active fragments
of the protein of SEQ ID NO: 411 have any of the biological activities described herein which are
associated with the PBR. In another embodiment, the polypeptide of the invention is encoded by
clone 181-10-1-0-C9-CS.

One aspect of the subject invention provides compositions and methods using the protein of
the invention, or biologically active fragments thereof, for the development, identification, and/or
selection of agents capable of modulating the expression or activity of the protein of the invention.

Agents which modulate the activity of the PBR/IBP of the subject invention include, but are
not limited to, antisense oligonucleotides, ribozymes, drugs, and antibodies. These agents may be
made and used according to methods well known in the art. Also, the protein of the invention, or
biologically active fragments thereof, may be used in screening assays for therapeutic compounds.
A variety of drug screening techniques may be employed. In this aspect of the invention, the
protein, or biologically active fragment thereof, may be free in solution, affixed to a solid support,
recombinantly expressed on, or chemically attached to, a cell surface, or located intracellularly.
The formation of binding complexes between the protein of the invention, or biologically active
fragments thereof, and the compound being tested may then be measured.

In one embodiment, the subject method utilizes eukaryotic or prokaryotic host cells which
are stably transformed with recombinant nucleic acids expressing the PBR/IBP polypeptide or
biologically active fragments thereof. The transformed cells may be viable or fixed. Drugs or
compounds which are candidates for the modulation of the PBR/IBP, or biologically active
fragments thereof, are screened against such transformed cells in binding assays well known to
those skilled in the art. Alternatively, assays such as those taught in Geysen H. N., WO Application
84/03564, published on Sep. 13, 1984, and incorporated herein by reference in its entirety, may be
used to screen for peptide compounds which demonstrate binding affinity for, or the ability to
modulate, the PBR/IBP, or biologically active fragments thereof. In another embodiment,
competitive drug screening assays are used in which neutralizing antibodies specifically compete
with a test compound for binding to the PBR/IBP protein of the invention, or biologically active
fragments thereof.

Another embodiment of the subject invention provides compositions and methods of
selectively modulating the expression or activity of the protein of the invention. Modulation of the
PBR/IBP would allow for the successful treatment and/or management of diseases or biochemical
abnormalities associated with the PBR or PBR/IBP. Antagonists, able to reduce or inhibit the
expression or the activity of the protein of the invention, would be useful in the treatment of
diseases associated with elevated levels of the PBR/IBP, increased cell proliferation, or increased
cholesterol transport. Thus, the subject invention provides methods for treating a variety of diseases
or disorders, including, but not limited to, cancers, especially liver cancer, and portal-systemic
encephalopathy.

Alternatively, the subject invention provides methods of treating diseases or disorders
associated with decreased levels of the PBR/IBP protein. Thus, the subject invention provides
methods of treating diseases including, but not limited to, schizophrenia, chronic stress, GAD, PD,
GSP and PTSD. Other diseases which may be treated by agonists of the PBR/IBP of the subject
invention include those diseases associated with decreases in cell proliferation, e.g. developmental
retardation.

Furthermore, because the PBR/IBP of the subject invention is also able to transport
cholesterol into cells, the subject invention may also be used to increase cholesterol transport into
cells. Diseases associated with cholesterol transport deficiencies include lipoidal adrenal
hyperplasia, and diseases in which there is a requirement for increased production of
cholesterol-requiring components such as myelin, for example Alzheimer's disease, spinal cord
injury, and brain development neuropathy [Snipes, G. and Suter, U. (1997) Cholesterol and Myelin.
In: Subcellular Biochemistry, Robert Bittman (ed.), vol. 28, pp. 173-204, Plenum Press, New York].
The methods of treating disorders associated with decreased levels of PBR/IBP may be practiced by
introducing agonists which stimulate the expression or the activity of the protein of the invention.

In one embodiment, methods of increasing the levels of PBR/IBP in tissues or cell types
may be practiced by utilizing nucleic acids encoding the protein of the subject invention, or
biologically active fragments thereof, to introduce biologically active polypeptide into targeted cell
types. Vectors useful in such methods are known to those skilled in the art, as are methods of
introducing such nucleic acids into target tissues.

Agents which stimulate or inhibit the activity of the protein of the invention include but are 
not limited to agonist and antagonist drugs respectively. These drugs can be obtained using any of a 
variety of drug screening techniques as discussed above. 

Antagonists of the PBR/IBP encoded by SEQ ID NO: 170 include agents which decrease
the levels of expressed mRNA encoding the protein of SEQ ID NO: 411. These include, but are not
limited to, RNAi, one or more ribozymes capable of cleaving the mRNA encoding the protein of the
invention, or antisense oligonucleotides capable of hybridizing to mRNA encoding the PBR/IBP of
SEQ ID NO: 411. Antisense oligonucleotides can be administered as DNA, as DNA entrapped in
proteoliposomes containing viral envelope receptor proteins [Kanoda, Y. et al. (1989) Science
243:375] or as part of a vector which can be expressed in the target cell and provide antisense DNA or
RNA. Vectors which are expressed in particular cell types are known in the art. Alternatively, the
DNA can be injected along with a carrier. A carrier can be a protein such as a cytokine, for example
interleukin 2, or polylysine-glycoprotein carriers. Carrier proteins, vectors, and methods of making
and using polylysine carrier systems are known in the art. Alternatively, nucleic acid encoding
antisense molecules may be coated onto gold beads and introduced into the skin with, for example,
a gene gun [Ulmer, J.B. et al. (1993) Science 259:1745].
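
The hybridization principle behind the antisense oligonucleotides described above can be illustrated with a minimal Python sketch that derives the antisense DNA sequence (the reverse complement) for a chosen window of an mRNA. The function name, the 18-nucleotide window, and the example mRNA fragment are illustrative assumptions only; the fragment is made up and is not taken from SEQ ID NO: 170.

```python
# Hypothetical helper: derive an antisense DNA oligonucleotide for an mRNA window.
COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return the DNA reverse complement of mrna[start:start + length]."""
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("window extends past the end of the sequence")
    # Read the target 3'->5' and pair each base with its DNA complement.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Made-up mRNA fragment, used only for illustration.
example_mrna = "AUGGCUCCGAAGUUCGACUGGAGC"
oligo = antisense_oligo(example_mrna, 0, 18)
print(oligo)
```

In practice, selecting an effective target window would also involve considerations such as target accessibility and specificity, which this sketch does not attempt to model.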

Antibodies, or other polypeptides, capable of reducing or inhibiting the activity of PBR/IBP
may be provided in isolated and substantially purified form. Alternatively, antibodies or other
polypeptides capable of inhibiting or reducing the activity of the PBR/IBP protein may be
recombinantly expressed in the target cell to provide a modulating effect. In addition, compounds
which inhibit or reduce the activity of the PBR/IBP protein of the subject invention may be
incorporated into biodegradable polymers that are implanted in the vicinity of where drug delivery is
desired. For example, biodegradable polymers may be implanted at the site of a tumor or,
alternatively, biodegradable polymers containing antagonists/agonists may be implanted to slowly
release the compounds systemically. Biodegradable polymers, and their use, are known to those of
skill in the art (see, for example, Brem et al. (1991) J. Neurosurg. 74:441-446).

In another embodiment, the invention provides methods and compositions for detecting the
level of expression of the mRNA of the protein of the invention. Quantification of mRNA levels of
the PBR/IBP protein of the invention may be useful for the diagnosis or prognosis of diseases
associated with an altered expression of the protein of the invention. Assays for the detection and
quantification of the mRNA of the protein of the invention are well known in the art (see, for
example, Maniatis, Fritsch and Sambrook, Molecular Cloning: A Laboratory Manual (1982), or
Current Protocols in Molecular Biology, Ausubel, F.M. et al. (Eds), Wiley & Sons, Inc.).

Polynucleotide probes or primers for the detection of the mRNA of the protein of SEQ ID
NO: 411 can be designed from the cDNA of SEQ ID NO: 170. Methods for designing probes and
primers are known in the art. In another embodiment, the subject invention provides diagnostic kits
for the detection of the mRNA of the protein of the invention in cells. The kit comprises a package
having one or more containers of oligonucleotide primers for detection of the mRNA of the protein
of the invention in PCR assays or one or more containers of polynucleotide probes for the detection
of the mRNA of the protein of the invention by in situ hybridization or Northern analysis. Kits may,
optionally, include containers of various reagents used in various hybridization assays. The kit may
also, optionally, contain one or more of the following items: polymerization enzymes, buffers,
instructions, controls, or detection labels. Kits may also, optionally, include containers of reagents
mixed together in suitable proportions for performing the hybridization assay methods in
accordance with the invention. Reagent containers preferably contain reagents in unit quantities that
obviate measuring steps when performing the subject methods.

In another embodiment, the invention relates to methods and compositions for detecting and
quantifying the level of the protein of the invention present in a particular biological sample. These
methods are useful for the diagnosis or prognosis of diseases associated with altered levels of the
protein of the invention. Diagnostic assays to detect the protein of the invention may comprise a
biopsy, in situ assay of cells from organ or tissue sections, or an aspirate of cells from a tumor or
normal tissue. In addition, assays may be conducted upon cellular extracts from organs, tissues,
cells, urine, or serum or blood or any other body fluid or extract.

Assays for the quantification of the PBR/IBP of SEQ ID NO: 411 may be performed
according to methods well known in the art. Typically, these assays comprise contacting the sample
with a ligand of the protein of the invention or an antibody (polyclonal or monoclonal) which
recognizes the protein of the invention or a fragment thereof, and detecting the complex formed
between the protein of the invention present in the sample and the ligand or antibody. Fragments of
the ligands and antibodies may also be used in the binding assays, provided these fragments are
capable of specifically interacting with the PBR/IBP of the subject invention. Further, the ligands
and antibodies which bind to the PBR/IBP of the invention may be labeled according to methods
known in the art. Labels which are useful in the subject invention include, but are not limited to,
enzyme labels, radioisotopic labels, paramagnetic labels, and chemiluminescent labels. Typical
techniques are described by Kennedy, J. H., et al. (1976) Clin. Chim. Acta 70:1-31; and Schurs, A.
H. et al. (1977) Clin. Chim. Acta 81:1-40.

The subject invention also provides methods and compositions for the identification of
metastatic tumor masses. In this aspect of the invention, the polypeptides and antibodies which
bind the polypeptides of the invention may be used as markers for the identification of the
metastatic tumor mass. Metastatic tumors which originated from the liver may overexpress the
PBR/IBP of SEQ ID NO: 411, whereas newly forming tumors, or those originating from other
tissues, are not expected to bear the PBR/IBP of SEQ ID NO: 411.

Protein of SEQ ID NO: 397 (internal designation 160-28-4-0-C4-CS)

The protein of SEQ ID NO: 397, encoded by the cDNA of SEQ ID NO: 156 (clone
160-28-4-0-C4-CS), exhibits homology to the ADP-ribosylation factor (ARF) family of proteins. The
ARF family includes ADP-ribosylation factors (ARFs) and ARF-like proteins (ARLs); the ARF
family of proteins is one family of the Ras superfamily. Proteins belonging to the Ras superfamily
have molecular weights of 18-30 kDa and function in a variety of cellular processes including, but
not limited to, signaling, growth, immunity, and protein transport.

ARFs are monomeric GTP-binding proteins, related structurally to both G protein alpha-subunits
and Ras proteins. ARF family members share more than 60% sequence identity, appear to
be ubiquitous in eukaryotes, and are highly conserved evolutionarily. Immunologically,
they have been localized to the Golgi apparatus of several types of cells (Stearns et al. Proc. Natl.
Acad. Sci. (USA) 87:1238-1242 (1990)). ARF proteins enhance the ADP-ribosyltransferase
activity of cholera toxin as an allosteric activator (Noda et al. Biochim. Biophys. Acta
1034:195-199 (1990)). ARFs have also been shown to act as regulatory molecules, or "switches", for
linking two processes (e.g., the process of vesicle fission from a donor compartment and fusion with
an acceptor compartment) (Rothman, J. E. and Wieland, F. T. Science 272:227-234 (1996)). ARF
family members fall into three classes, classes I-III, according to their size and sequence homology.
Class I comprises ARF1, ARF2, and ARF3; Class II comprises ARF4 and ARF5; and Class III
comprises ARF6.

The classes occupy different subcellular locations and have been implicated in different
transport pathways. Class I ARFs localize to the Golgi where they are involved in the regulation of
ER-Golgi and intra-Golgi transport. Class I ARFs are also involved in the recruitment of cytosolic
coat proteins to Golgi membranes during the formation of transport vesicles. Class III (e.g., ARF6)
localizes to a tubulovesicular compartment, secretory granules, and the plasma membrane, where it
is involved in regulated secretion and recycling. Class II ARFs appear to be cytosolic, but their role
has not been elucidated (Radhakrishna, H. and Donaldson, J. G. J. Cell Biol. 139:49-61 (1997)).

ARF function, in general, is regulated by a GDP-GTP cycle. For example, ARF1 is
cytosolic in the GDP bound state, but is associated with membranes when in the GTP bound state.
A guanine nucleotide exchange factor (GEF) in the donor compartment recruits ARF1 to the
membrane. At the membrane, GTP-ARF1 recruits coat proteins, which assemble together into
spherical coats, budding off vesicles in the process. After budding, hydrolysis of bound GTP causes
ARF1 to dissociate from the membrane. ARF1 dissociation causes the coat to become unstable and
dissociate as well (Rothman, supra).

Members of the ARF multigene family, when expressed as recombinant proteins in E. coli,
display different phospholipid and detergent requirements (Price, et al. J. Biol. Chem.
267:17766-17772 (1992)). Some lipids and/or detergents, e.g., SDS, cardiolipin,
dimyristoylphosphatidylcholine (DMPC)/cholate, enhance ARF activities (Bobak, et al.
Biochemistry 29:855-861 (1990); Noda, et al. Biochim. Biophys. Acta 1034:195-199 (1990); Tsai,
et al. J. Biol. Chem. 263:1768-1772 (1988)). ARFs also activate phospholipase D (PLD), a
membrane-bound enzyme implicated as an effector of several growth factors (Boman, A. L. and
Kahn, R. A. Trends Biochem. Sci. 20:147-150 (1995)). PLD1 has been shown to be activated by a
variety of G-protein regulators, for example, PKC (protein kinase C) and ADP-ribosylation factor
(ARF). PKC and ARFs may regulate G-proteins either individually or together in a synergistic
manner. Recently the role of ARFs in microtubule formation has also been demonstrated.
ADP-ribosylation of tubulin almost completely blocked self-assembly of this protein in brain
(Terashima M. et al. J. Nutr. Sci. Vitaminol. 45:393-400 (1999)).

In general, differences in the various ARF sequences are concentrated in the amino-terminal
regions and the carboxyl portions of the proteins. Only three of 17 amino acids in the amino termini
have been shown to be identical among ARFs, and four amino acids in this region of ARFs 1-5 are
missing in ARF6 (Tsuchiya, et al. J. Biol. Chem. 266:2772-2777 (1991)). It was reported (Kahn, et
al. J. Biol. Chem. 267:13039-13046 (1992)) that the amino-terminal regions of ARF proteins form
an alpha-helix and that this domain is required for membrane targeting, interaction with lipid, and
ARF activity.

Schliefer et al. (J. Biol. Chem. 257:20-23 (1991)) have described a protein distinctly larger
than ARF that possessed ARF-like activity. ARF-like proteins, or ARLs, have been found in
different species. Some ARLs appear to lack ADP-ribosyltransferase-enhancing activity; ARLs
may differ in GTP-binding requirements and GTPase activity as compared to various ARF
isoforms. For example, ARP, a mammalian ARL, is 33-39% identical to members of the ARF
family; ARP, however, differs from other ARF family proteins by virtue of its ability to hydrolyze
bound GTP in the absence of other proteins. ARP protein, unlike ARFs, is typically associated with
the plasma membrane instead of the cytosol (Schurmann, A. J. Biol. Chem. 270:30657-30663 (1995)).

ARF family members have been implicated in several disease processes, such as Lowe's
syndrome, an X-linked disorder characterized by congenital cataracts, renal tubular dysfunction and
neurological deficits. These disorders may be due to an inability to recruit ARF to the Golgi
membrane (Suchy, S. F. et al. Hum. Mol. Genet. 4:2245-2250 (1995); Londono I. et al. Kidney Int.
55:1407-1416 (1999)). It has also been suggested that regulation of ARF is involved in cystic
fibrosis, Dent's disease, diabetes, and autosomal dominant polycystic kidney disease (Marshansky,
V., et al. Electrophoresis 18:2661-2676 (1997)).

The new human ARF-related protein of SEQ ID NO: 397, encoded by clone 160-28-4-0-C4-CS
in one embodiment, and the related polynucleotides, provide new compositions which are useful
in the diagnosis, treatment, and prevention of secretory, exocytosis, endocytosis and other
"sorting disorders."

The subject invention provides a polypeptide comprising the amino acid sequence of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS, or biologically active fragments thereof. The intact protein
of interest is 173 amino acids in length, has an ARF family amino acid motif (Pfam), and has an
ATP/GTP-binding site motif A (P-loop) (PS00017). The protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS also has chemical and structural similarity with human ARL1 (P40616), ARD-1
(R66033) and ARF6 (GI 178989) (31%, 31% and 27% identity, respectively). The amino acid
length of SEQ ID NO: 397 is similar to those of the aforementioned ARFs. Biologically active
fragments of SEQ ID NO: 397 have one or more of the biological activities typically associated with
the full length protein. In one embodiment, the protein is encoded by clone 160-28-4-0-C4-CS.
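
A motif such as the ATP/GTP-binding site motif A (P-loop) mentioned above can be located with a simple pattern search. The following Python sketch assumes the PROSITE PS00017 pattern [AG]-x(4)-G-K-[ST] and expresses it as a regular expression. The function name is hypothetical, and the test sequence is the N-terminal region of human H-Ras, used only as a familiar positive control, not the sequence of SEQ ID NO: 397.

```python
import re

# PROSITE PS00017 pattern [AG]-x(4)-G-K-[ST], written as a Python regex.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein: str):
    """Return (1-based start, matched text) for each non-overlapping P-loop match."""
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(protein.upper())]


# N-terminal residues of human H-Ras, a well-known P-loop GTPase (illustration only).
hras_nterm = "MTEYKLVVVGAGGVGKSALTIQLIQ"
hits = find_p_loops(hras_nterm)
print(hits)
```

A pattern match of this kind indicates only that the motif is present in the sequence; it does not by itself establish nucleotide binding.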

The invention also provides variants of the protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS. The variants have at least about 80%, more preferably at least about 90%, and most
preferably at least about 95% amino acid sequence identity to the amino acid sequence of SEQ ID
NO: 397 or clone 160-28-4-0-C4-CS. Variants according to the subject invention have at least one
functional and/or structural characteristic of ARFs. The invention also provides biologically active
fragments of the variant proteins.
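
The identity thresholds recited above can be made concrete with a short calculation. The sketch below assumes that the two sequences have already been aligned (gaps written as "-") and defines percent identity as identical columns divided by aligned columns. This is one common convention among several, and the function name and toy sequences are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy pre-aligned sequences differing at one of ten positions (illustration only).
reference = "MGKLLSKIFG"
variant = "MGKVLSKIFG"
identity = percent_identity(reference, variant)
meets_80 = identity >= 80.0
meets_95 = identity >= 95.0
print(identity, meets_80, meets_95)
```

Because different alignment programs and gap treatments can yield different values, any stated percentage is best read together with the alignment method used to obtain it.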

The invention includes those polynucleotides encoding the protein of SEQ ID NO: 397 or
clone 160-28-4-0-C4-CS, variants of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, and biologically
active fragments of both the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS and variants
thereof. As is apparent to those skilled in the art, a variety of different DNA sequences can encode
the amino acid sequence of the proteins, variants, and biologically active fragments of said proteins
and variants. It is well within the skill of a person trained in the art to create these alternative DNA
sequences encoding proteins having the same, or essentially the same, amino acid sequence. These
variant DNA sequences are also within the scope of the subject invention. As used herein,
reference to "essentially the same" sequence refers to sequences that have amino acid substitutions,
deletions, additions, or insertions that do not materially affect biological activity.
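
The statement that many different DNA sequences can encode one amino acid sequence follows from the degeneracy of the genetic code. As a minimal sketch, assuming the standard genetic code, the following Python snippet counts the distinct coding sequences for a peptide by multiplying the number of codons available for each residue. The function name and the four-residue peptide are illustrative assumptions.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Count distinct DNA coding sequences for a peptide (stop codon not included)."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Toy peptide Met-Lys-Leu-Ser: 1 * 2 * 6 * 6 possible coding sequences.
n_sequences = count_encoding_sequences("MKLS")
print(n_sequences)
```

Even for this four-residue peptide the count is 72, which illustrates why the number of alternative coding sequences for a full-length protein is very large.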

The subject invention provides methods of treating cytoskeletal, secretory, and inflammatory
disorders/conditions comprising the administration of therapeutically effective amounts of a
composition comprising the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS. These
methods can also be practiced using variants of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, or
biologically active fragments of either SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, or variants of
SEQ ID NO: 397 or clone 160-28-4-0-C4-CS. Disorders/conditions which can be treated by the
subject invention include, but are not limited to, prostate cancer, brain and other tumors, Lowe's
syndrome, glomerulonephritis, chronic glomerulonephritis, tubulointerstitial nephritis, inherited
X-linked nephrogenic diabetes insipidus, autosomal dominant polycystic kidney disease (ADPKD),
herpes gestationis, dermatitis herpetiformis, lupus erythematosus, Crohn's disease, irritable bowel
syndrome and Addison's disease; secretory/endocytotic disorders such as cystic fibrosis,
glucose-galactose malabsorption syndrome, hypercholesterolemia, hyper- and hypoglycemia, Graves'
disease, goiter, and Cushing's disease; conditions associated with abnormal vesicle trafficking,
including acquired immunodeficiency syndrome (AIDS); allergies including hay fever, asthma, and
urticaria (hives); autoimmune hemolytic anemia; multiple sclerosis; myasthenia gravis; rheumatoid
and osteoarthritis; Chediak-Higashi and Sjogren's syndromes; toxic shock syndrome; traumatic
tissue damage; viral, bacterial, fungal, helminthic, and protozoal infections.

In another embodiment, a vector capable of expressing the protein of SEQ ID NO: 397 or
clone 160-28-4-0-C4-CS, or biologically active fragments thereof, can be administered to a subject
to treat or prevent disorders including, but not limited to, those described above. Alternatively, the
vector can encode a variant, or biologically active fragment of the variant protein. Multiple vectors
encoding any combination of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, variants, and/or
biologically active fragments of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS and/or variants can
be administered to a subject.

In a further embodiment, a pharmaceutical composition comprising a substantially purified
protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments
thereof), in conjunction with a suitable pharmaceutical carrier, can be administered to a subject to
treat or prevent the above mentioned disorders. Alternatively, a pharmaceutical composition
comprising a substantially purified variant protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS
(and/or biologically active fragments thereof), in conjunction with a suitable pharmaceutical carrier,
can be administered in the aforementioned therapeutic regimens. As would be apparent to the
skilled artisan, any therapeutically effective combination of the protein encoded by SEQ ID NO:
397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments thereof) and variants of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments thereof), in
conjunction with a suitable pharmaceutical carrier, can be used in the aforementioned therapeutic
regimens.

ARFs are known to be involved in regulated transport of vesicles. Therefore, in another
embodiment, the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, variants, and/or
biologically active fragments of said proteins and/or variants can be used as a component of drug
delivery vehicles such as colloids or liposomes. The protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS, variants, and/or biologically active fragments of said proteins and/or variants can be
incorporated into the lipid membranes of liposomes and can serve as specific targeting agents. The
methods of design of such drug delivery systems are known by those skilled in the art and can be
practiced according to conventional pharmaceutical principles (Smith HJ. Introduction to the
principles of drug design and action, 3rd ed. (1998); Chien Y.W. Novel Drug Delivery Systems, 2nd
ed. (1992); Storm G. et al. J. Liposome Res. 4:641-666 (1994); and Crommelin D.J.A. et al. Adv.
Drug Delivery Rev. 17:49-60 (1995)).

In another embodiment of the invention, the polynucleotides encoding the protein of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS can be used for therapeutic purposes. Polynucleotides
encoding fragments of the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS can also be used
in therapeutic regimens. In one aspect, the complement of the polynucleotide encoding the protein
of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS can be used in situations in which it would be
desirable to block the transcription of the mRNA. Modifications of gene expression can be obtained
by designing complementary sequences or antisense molecules (DNA, RNA, or PNA) to the
control, 5', or regulatory regions of the gene encoding the protein of interest. Such technology is
now well known in the art, and sense or antisense oligonucleotides or larger fragments can be
designed from various locations along the coding or control regions of sequences encoding the
protein of interest. Methods of treatment utilizing antisense technology are also well known to
those skilled in the art.

Another embodiment of the invention provides methods of assessing PLD modulation by
using ARF properties of the protein of interest.

In another embodiment, antibodies which specifically bind the protein of SEQ ID NO: 397
or clone 160-28-4-0-C4-CS can be used for the diagnosis of disorders characterized by expression
of the protein, or in assays to monitor patients being treated with the protein of interest. Methods of
making both polyclonal and monoclonal antibodies are well known in the art. Diagnostic assays
which can be used in this aspect of the invention include, but are not limited to, ELISAs, RIAs, and
FACS, and are well known in the art. These assays also provide a basis for diagnosing or
identifying altered or abnormal levels of expression of SEQ ID NO: 397 or the polypeptides encoded
by the human cDNA of clone 160-28-4-0-C4-CS as compared to normal individuals. These
screening methods are, likewise, well known to the skilled artisan.

In another embodiment of the invention, the protein of interest, its catalytic or immunogenic
fragments, or oligopeptides thereof can be used for screening libraries of compounds in any of a
variety of drug screening techniques. The fragment employed in such screening can be free in
solution, affixed to a solid support, recombinantly expressed on, or chemically attached to, a cell
surface, or located intracellularly. The formation of binding complexes between the protein of
interest and the agent being tested can be measured by methods well known to those skilled in the
art. Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.)

In another embodiment of the invention, the polynucleotides encoding the protein of interest
can be used for diagnostic purposes. The polynucleotides can be used to detect and quantify gene
expression in biopsied tissues in which expression of the protein of interest can be correlated with a
disease or condition. Such diagnostic assays are well known in the art and can be used to monitor
regulation of the levels of the protein of interest during therapeutic intervention and/or to determine
absence, presence, and excess expression of the protein of interest. Examples of such conditions
and disorders have been provided supra. The polynucleotide sequences encoding the protein of
interest can be used, for example, in Southern or Northern analyses, dot blot, or other
membrane-based technologies; in PCR technologies; in dipstick, pin, and ELISA assays; and in
microarrays utilizing fluids or tissues from patients to detect altered expression of the protein of
SEQ ID NO: 397 or clone 160-28-4-0-C4-CS. Such qualitative or quantitative methods are well
known in the art.

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein can be used as targets in a microarray. The microarray
can be used to monitor the expression level of large numbers of genes simultaneously and to
identify genetic variants, mutations, and polymorphisms. This information can be used to determine
gene function, to understand the genetic basis of a disorder, to diagnose a disorder, and to develop
and monitor the activities of therapeutic agents. Microarrays can be prepared, used, and analyzed
using methods known in the art. (See, e.g., Brennan, T. M. et al. (1995) U.S. Pat. No. 5,474,796;
Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. 94:2150-2155; and Heller, M. J. et al. (1997) U.S.
Pat. No. 5,605,662.)

Another embodiment of the subject invention provides nucleic acid sequences encoding the
protein of interest which can be extended utilizing a partial nucleotide sequence and various
PCR-based methods. This aspect of the invention provides methods for the detection of upstream
sequences, such as promoters and regulatory elements. Methods of practicing this aspect of the
invention are also well known in the art.

In other embodiments of the disclosed therapeutic regimens, any of the proteins, variants,
biologically active fragments, antibodies, complementary sequences, or vectors of the invention can
be administered in combination with other appropriate therapeutic agents. Selection of the
appropriate agents for use in combination therapy can be made by one of ordinary skill in the art.
The combination of therapeutic agents can act synergistically to effect the treatment or prevention
of the various disorders described above. In particular, purified protein can be used to produce
antibodies or to screen libraries of pharmaceutical agents to identify those which specifically bind
the protein of interest. Neutralizing antibodies are especially preferred for therapeutic use.

Protein of SEQ ID NO: 287 (internal designation 174-5-3-0-H7-CS)

The protein of SEQ ID NO: 287, encoded by human cDNA of SEQ ID NO: 46 (clone
174-5-3-0-H7-CS), is highly homologous (more than 99% identity in amino acids) to the human protein
encoded by the CLN8 gene listed in Genbank under accession number AF123757. The two
proteins differ by two conservative amino-acid substitutions (alanine for valine at position 155 and
serine for asparagine at position 225). In addition, the protein encoded by 174-5-3-0-H7-CS
contains seven transmembrane domains. These domains are located at amino acids 25-45, 71-91,
100-120, 133-153, 160-180, 205-225, and 228-248 as predicted by the software TopPred II (Claros
and von Heijne, CABIOS applic. Notes, 10:685-686 (1994)). The protein encoded by SEQ ID
NO: 287 also exhibits a signal peptide at positions 1-50 and a retention signal KKRP from positions
283 to 286.

CLN8 was identified recently by positional cloning (Ranta et al., Nat Genet. 1999
Oct;23(2):233-6). CLN8 encodes a 286 amino-acid putative transmembrane protein with no
homology to previously known proteins. A naturally-occurring missense mutation in codon 24
(R24G at the border of the first putative transmembrane domain) is the molecular basis for EPMR
("progressive epilepsy with mental retardation", MIM 600143). EPMR, also called Northern
Epilepsy, is an autosomal recessive disorder characterized by normal early development, onset of
generalized tonic-clonic seizures between the ages of 5 and 10 years, and subsequent progressive
mental retardation. Neuropathological findings have shown that EPMR is a new member of the
neuronal ceroid lipofuscinosis (NCL) group of neurodegenerative disorders. The NCLs are a
genetically heterogeneous group of progressive neurodegenerative disorders characterized by the
accumulation of autofluorescent lipopigment in various tissues. CLN8 is the eighth gene to be
linked to the NCL group of neurodegenerative disorders.

Subsequently, the homologous mouse gene (Cln8) was sequenced (82% nucleotide identity
with the human gene) and localized to the region of the mouse genome linked to motor neuron
degeneration, mouse mnd. Mnd is a naturally-occurring mouse mutant with intracellular
autofluorescent inclusions similar to those seen in EPMR. A mutation in mnd mouse DNA was
identified, indicating that mnd is a murine ortholog for CLN8 (Ranta et al., Nat Genet. 1999
Oct;23(2):233-6), and that mice containing mutations in Cln8 represent a murine model for NCL
disorders.

Recent experimental evidence has confirmed the transmembrane nature of the CLN8
protein (Lonka L et al., Hum Mol Genet. 2000 Jul 1;9(11):1691-7). CLN8 resides in the
endoplasmic reticulum (ER) and recycles between the ER and the ER-Golgi intermediate
compartment (ERGIC) via a KKXX ER-retrieval motif at its C-terminus (KKRP, amino acids
283-286). This motif is recognized and bound by COPI, a vesicle-coating protein found in retrograde
vesicles delivering cargo from the cis Golgi to the ER. The 30 kD CLN8 protein is not processed
during its maturation (in particular it is not N-glycosylated). The EPMR-associated R24G mutation
does not alter cellular localization in humans.
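As an illustration of the KKXX ER-retrieval motif discussed above, the following minimal Python sketch checks whether a protein sequence ends with two lysines followed by any two residues and reports the motif position in 1-based residue numbering. The function name and the placeholder sequence are hypothetical and introduced only for this example; the placeholder is not the sequence of SEQ ID NO: 287, and a real analysis would use the actual sequence.

```python
import re

# Dilysine ER-retrieval motif: lysines at positions -4 and -3 from the
# C-terminus, followed by any two residues (KKXX).
KKXX_PATTERN = re.compile(r"KK[A-Z]{2}$")


def find_kkxx_motif(protein_sequence):
    """Return (start, end, motif) in 1-based residue numbering, or None."""
    sequence = protein_sequence.strip().upper()
    match = KKXX_PATTERN.search(sequence)
    if match is None:
        return None
    # match.start() is 0-based; convert to 1-based residue numbering.
    return (match.start() + 1, match.end(), match.group())


# Hypothetical 286-residue placeholder (282 alanines plus the KKRP tail
# described in the text); NOT the real sequence of SEQ ID NO: 287.
toy_sequence = "A" * 282 + "KKRP"
motif_hit = find_kkxx_motif(toy_sequence)
```

For the placeholder above, the hit spans residues 283 to 286, matching the positions given in the text for the KKRP retention signal. The check is purely positional and says nothing about whether the motif is functional or bound by COPI in a given protein.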

The subject invention provides a polypeptide encoded by SEQ ID NO: 287 and biologically
active fragments of said polypeptide. Compositions comprising polypeptides and pharmaceutically
acceptable carriers are likewise provided. Preferred polypeptides, and biologically active fragments
thereof, have any of the biological activities or domains/motifs described herein and/or contain the
amino acids of positions 155, 225, and 283 to 286. In one embodiment, the protein/polypeptide of
SEQ ID NO: 287 is encoded by clone 174-5-3-0-H7-CS.

The ER/ERGIC cellular localization of the protein of this invention can be used to target
compounds to the ER/ERGIC. This targeting can be observed using any of the techniques known to
those skilled in the art, including those described in Lonka L et al., Hum Mol Genet. 2000 Jul
1;9(11):1691-7. In this aspect of the invention, the protein of SEQ ID NO: 287, or biologically
active fragments thereof, can be used to target liposomes, vesicles, or colloids to the ER/ERGIC
compartment where active agents can be delivered. Methods of making and using targeted
liposomes are well known in the art.

In another embodiment, liposomes comprising the protein of SEQ ID NO: 287 can contain a
second targeting agent for the specific selection of a target cell. The second targeting agent can be
selected for its ability to specifically target a cell or tissue. Thus, the second targeting agent can be
specific for tumor markers, such as HER2. Alternatively, markers associated with specific cell
types can be used (e.g., CD34, CD4, CD8, etc.). In a preferred embodiment, the second targeting
agent is an antibody. Active agents include, but are not limited to, chemotherapeutic agents, protein
cross-linking agents, inhibitors of protein synthesis, anti-bacterial agents (e.g., antibiotics), antiviral
agents, and/or anti-parasitic agents. The ability to bind the COPI coatomer can be assayed as
described in Cosson P, Letourneur F, Science. 1994 Mar 18;263(5153):1629-31.

In another embodiment, the present invention provides methods of, and compositions for,
identifying specific cellular compartments, such as the ER, ERGIC, and retrograde transport
vesicles. This embodiment provides antibodies which specifically bind the protein of SEQ ID
NO: 287, or biologically active fragments thereof, which are labeled with detectable markers, such
as gold particles, enzymes, radioisotopes, or paramagnetic labels. ER, ERGIC, and retrograde
transport vesicles can be identified in samples according to well-known immuno-diagnostic
protocols. The antibodies, either monoclonal or polyclonal, can be made according to well-known
methods. In a preferred embodiment, the antibodies bind to the ER retention signal.

In another embodiment, the protein of the invention or part thereof can be used as a reagent
for differential identification of the tissue(s) or cell type(s) present in a biological sample and for
diagnosis of diseases and conditions, which include, but are not limited to, asthma, pulmonary
edema, atherosclerosis, restenosis, stroke potential, thrombosis and hypertension. Similarly, the
protein of the invention, or biologically active fragments thereof, and antibodies thereto can provide
immunological probes for differential identification of the tissue(s) or cell type(s). In a number of
disorders listed above, particularly of the pulmonary and cardiovascular systems, expression of this
protein at significantly higher or lower levels can be routinely detected in certain tissues or cell
types (e.g., vascular tissues, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum,
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

Indeed, the first 80 amino acids of the protein of the invention are identical to two
polypeptides claimed in Patent WO 99/35158, hereby incorporated by reference in its entirety (SEQ
ID NO:98 and SEQ ID NO:162, corresponding to Geneseq accession numbers Y38413/Y38428 and
Y38492), which are over-expressed in pulmonary and endothelial tissues.

The tissue distribution in pulmonary and endothelial tissues indicates that the protein
product described in WO 99/35158 is useful for the treatment and diagnosis of cardiovascular and
respiratory or pulmonary disorders such as asthma, pulmonary edema, pneumonia, atherosclerosis,
restenosis, stroke, angina, thrombosis, hypertension, inflammation, and wound healing. Those
conditions can be diagnosed by determining the amount of the protein of the invention in a sample.
Thus, antibodies raised against the protein of SEQ ID NO: 287, or an immunogenic fragment of the
protein, can be used in diagnostic, prognostic, or screening assays such as those taught in WO
99/35158.



Protein of SEQ ID NO: 270 (internal designation 116-119-3-0-H5-CS)

The protein of SEQ ID NO: 270, encoded by the extended cDNA SEQ ID NO: 29, is
homologous to the human mitochondrial ATP synthase f subunit or ATPK (E.C. 3.6.1.34) (Swissprot
accession number P56134) and is overexpressed in fetal kidney.

The protein of SEQ ID NO: 270, composed of 88 amino acid residues, contains 1
transmembrane segment (position 1 to 55) predicted by the software TopPred II (Claros and von
Heijne, CABIOS applic. Notes, 10:685-686 (1994)). BLAST results show that 100% homology is found
between amino acids 5 to 88 of the protein of the invention and amino acids 10 to 93 of human ATP
synthase f chain (93 amino acids total); exon 1 of the cDNA SEQ ID NO: 29 accounts for the difference
between the 2 proteins (the last 3 exons show 100% homology). Thus, the protein of the invention
represents a new isoform of human mitochondrial ATP synthase f subunit. It is interesting to note that
the same splice variant is found in bovine, pig and mouse species.

The mitochondrial electron transport (or respiratory) chain is a series of enzyme complexes in
the mitochondrial membrane that is responsible for the transport of electrons from NADH to oxygen
and the coupling of this oxidation to the synthesis of ATP (oxidative phosphorylation). ATP then
provides the primary source of energy for driving a cell's many energy-requiring reactions. ATP synthase
(F0F1 ATPase) is the enzyme complex at the terminus of this chain and serves as a reversible coupling
device that interconverts the energies of an electrochemical proton gradient across the mitochondrial
membrane into either the synthesis or hydrolysis of ATP. This gradient is produced by other enzymes
of the respiratory chain in the course of electron transport from NADH to oxygen. When the cell's
energy demands are high, electron transport from NADH to oxygen generates an electrochemical
gradient across the mitochondrial membrane. Proton translocation from the outer to the inner side of the
membrane drives the synthesis of ATP. Under conditions of low energy requirements and when there is
an excess of ATP present, this electrochemical gradient is reversed and ATP synthase hydrolyzes ATP.
The energy of hydrolysis is used to pump protons out of the mitochondrial matrix. ATP synthase is,
therefore, a dual complex, the F0 portion of which is a transmembrane proton carrier or pump, and the
F1 portion of which is catalytic and synthesizes or hydrolyzes ATP. Mammalian ATP synthase
complex consists of sixteen different polypeptides (Walker, J. E. and Collinson, T. R. (1994) FEBS
Lett. 346:39-43). Six of these polypeptides (subunits alpha, beta, gamma, delta, epsilon, and an ATPase
inhibitor protein IF1) comprise the globular catalytic F1 ATPase portion of the complex, which lies
outside of the mitochondrial membrane. The remaining ten polypeptides (subunits a, b, c, d, e, f, g, F6,
OSCP, and A6L) comprise the proton-translocating, membrane-spanning F0 portion of the complex.
Like other members of the respiratory chain, all but two of the polypeptide subunits of ATP synthase
are nuclear gene products that are imported into the mitochondria. Enzyme complexes similar to
mammalian ATP synthase are found in all cell types and in chloroplast and bacterial membranes. This
universality indicates the central importance of this enzyme to ATP metabolism. Transcriptional
regulation of these nuclear encoded genes appears to be the predominant means for controlling the
biogenesis of ATP synthase. Multiple mitochondrial pathologies exist because of the essential role of
mitochondrial oxidative phosphorylation in cellular energy production, in the generation of reactive
oxygen species and in the initiation of apoptosis (Wallace, Science, 283:1482-1488, 1999). It is now
clear that mitochondrial diseases encompass an assemblage of clinical problems commonly involving
tissues that have high energy requirements such as heart, muscle and the renal and endocrine systems.
Over the past 11 years, a considerable body of evidence has accumulated implicating defects in the
mitochondrial energy-generating pathway, oxidative phosphorylation, in a wide variety of degenerative
diseases including myopathy and cardiomyopathy. Most classes of pathogenic mitochondrial DNA
mutations affect the heart, in association with a variety of other clinical manifestations that can include
skeletal muscle, the central nervous system (including eye), the endocrine system, and the renal system.
Nuclear mutations causing mitochondrial disorders have been described. They are often found in highly
conserved subunits. Mitochondrial disorders with nuclear mutations include: myopathies (PEO,
MNGIE, congenital muscular dystrophy, carnitine disorders), encephalopathies (Leigh, Infantile,
Wilson's disease, Deafness-Dystonia syndrome), other systemic disorders and cardiomyopathies.

The discovery of a new ATP synthase subunit, and polynucleotides encoding it, satisfies a need
in the art by providing new compositions which are useful for the diagnosis, prevention, and treatment
of cancer, myopathies, immune disorders, and neurological disorders.

It is believed that the protein of SEQ ID NO: 270 or part thereof plays a role in cellular
respiration, preferably as a mitochondrial ATP synthase subunit. Preferred polypeptides of the
invention are fragments of SEQ ID NO: 270 having any of the biological activities described herein.

An object of the present invention is to provide compositions and methods of targeting heterologous
compounds, either polypeptides or polynucleotides, to mitochondria by recombinantly or chemically
fusing a fragment of the protein of the invention to a heterologous polypeptide or polynucleotide.
Preferred fragments are signal peptide, amphiphilic alpha helices and/or any other fragments of the
protein of the invention, or part thereof, that may contain targeting signals for mitochondria
including but not limited to matrix targeting signals as defined in Herrman and Neupert, Curr.
Opinion Microbiol. 3:210-4 (2000); Bhagwat et al. J. Biol. Chem. 274:24014-22 (1999); Murphy
Trends Biotechnol. 15:326-30 (1997); Glaser et al. Plant Mol Biol 38:311-38 (1998); Ciminale et al.
Oncogene 18:4505-14 (1999). Such heterologous compounds may be used to modulate
mitochondrial activities. For example, they may be used to induce and/or prevent
mitochondrial-induced apoptosis or necrosis. In addition, heterologous polynucleotides may be used for
mitochondrial gene therapy to replace a defective mitochondrial gene and/or to inhibit the
deleterious expression of a mitochondrial gene.

The invention further relates to methods and compositions using the protein of the invention
or part thereof to diagnose, prevent and/or treat several disorders in which the mitochondrial respiratory
electron transport chain is impaired, including but not limited to mitochondriocytopathies, necrosis,
aging, myopathies, cancer and neurodegenerative diseases such as Alzheimer's disease,
Huntington's disease, Parkinson's disease, epilepsy, Down's syndrome, dementia, multiple sclerosis,
and amyotrophic lateral sclerosis. For diagnostic purposes, the expression of the protein of the
invention could be investigated using any of the Northern blotting, RT-PCR or immunoblotting
methods described herein and compared to the expression in control individuals. For prevention
and/or treatment purposes, the protein of the invention may be used to enhance electron transport
and increase energy delivery using any of the gene therapy methods described herein or known to
those skilled in the art.

In another embodiment, the invention further relates to methods and compositions using the
protein of the invention or part thereof to diagnose, prevent and/or treat several disorders in which the
mitochondrial respiratory electron transport chain needs to be impaired, including but not limited to
Sjogren's syndrome, Addison's disease, bronchitis, dermatomyositis, polymyositis, glomerulonephritis,
diabetes mellitus, emphysema, Graves' disease, atrophic gastritis, lupus erythematosus, myasthenia
gravis, multiple sclerosis, autoimmune thyroiditis, ulcerative colitis, anemia, pancreatitis, scleroderma,
rheumatoid and osteoarthritis, asthma, allergic rhinitis, atopic dermatitis, and gout, using any
techniques known to those skilled in the art including the antisense or
triple helices strategies described herein.

Moreover, antibodies to the protein of the invention or part thereof may be used for 
detection of mitochondria organelles and/or mitochondrial membranes using any techniques known 
to those skilled in the art. 

Protein of SEQ ID NO: 271 (internal designation 117-001-5-0-G3-CS)

The protein of SEQ ID NO: 271 is homologous to the family of lipopolysaccharide (LPS)
binding proteins (LBPs). Several families of proteins have the ability to bind LPS including (a) the
lipopolysaccharide-binding proteins (LBPs), and (b) the bactericidal permeability-increasing
proteins (BPIs). Cholesteryl ester transfer protein (CETP), which is involved in the transfer of
insoluble cholesteryl esters in reverse cholesterol transport, shares some homology to members of
the LPS binding family of proteins.

Lipopolysaccharide (LPS), alternatively known as bacterial endotoxin, is a major component of
the outer membrane of Gram-negative bacteria. It consists of serotype-specific O-side chain
polysaccharides linked to a core oligosaccharide and Lipid A. LPS is a potent mediator of the
inflammatory response and stimulates the expression of many pro-inflammatory and pro-coagulant
compounds in monocytes, macrophages, and endothelial cells. While these responses are important in
containing and eliminating localized infections, systemic exposure to LPS can lead to a number of
adverse effects. These include: (a) induction of an inflammatory cascade, (b) damage to the
endothelium, (c) widespread coagulopathies, and (d) organ damage.

Systemic exposure to LPS can arise from direct infection by Gram-negative bacteria, leading to
the complications of Gram-negative sepsis. Examples of diseases which are associated with
Gram-negative bacterial infection or endotoxemia include bacterial meningitis, neonatal sepsis, cystic
fibrosis, inflammatory bowel disease, liver cirrhosis, Gram-negative pneumonia, Gram-negative
abdominal abscess, hemorrhagic shock, and disseminated intravascular coagulation. Subjects who are
leukopenic or neutropenic, including subjects treated with chemotherapy or immunocompromised
subjects, are particularly susceptible to bacterial infection and the subsequent effects of endotoxin
exposure.

Gram-negative sepsis remains one of the primary causes of severe systemic inflammation in
hospitalized and immunocompromised patients. Alternatively, changes in gut permeability caused by a variety
of circumstances, including trauma, can lead to translocation of bacteria/LPS into the bloodstream.
Bacteria translocated from the gut are thought to play a major role in post-surgical immunosuppression
(Little et al., Surgery 114:87-91 (1993)) and hemorrhagic shock. Therefore, there is great interest in
characterizing proteins involved in the biological response to LPS and in discovering therapies that can
counteract the effects of LPS in pathological situations.

LBP is a 60 kDa glycoprotein synthesized in the liver and present in normal human serum.
LBP expression is upregulated in response to infectious, inflammatory, and toxic mediators. LBP
expression has been induced in animals challenged with LPS, silver nitrate, turpentine, and
Corynebacterium parvum (Geller et al., Surgery 128:22-28 (1993); Gallay et al., Infect. Immun.
61:378-383 (1993); Tobias et al., J. Exp. Med. 164:777-793 (1986)). LBP levels are correlated with
exposure to LPS, and elevated levels (particularly persistent elevated levels) have been correlated with
poor clinical outcomes in septic patients (U.S. Patent Nos. 5,484,705, and 5,804,367, hereby
incorporated by reference in their entirety).

A portion of the LBP molecule (the N-terminal 1-197 aa) binds to the lipid A portion of the
LPS molecule to form a high affinity LBP/LPS complex (Tobias, et al., J. Biol. Chem.
264:10867-10871 (1989)). The LBP/LPS complex potentiates the cellular response to LPS via an interaction
with the monocytic differentiation antigen CD14 (Wright et al., Science 249:1431-1433 (1990);
Lee et al., J. Exp. Med. 175:1697-1705 (1992)). LPS can be transferred from LBP to
membrane-bound or soluble CD14. Activated CD14 can then interact with endothelial cells to elicit an
inflammatory response. The C-terminal portion of LBP is required to transfer LPS to CD14 (U.S.
Pat. No. 5,731,415; Theofan et al., J. Immunol. 152:3624-29 (1994); Han et al., J. Biol. Chem.
269:8172-75 (1994)). Evidence also suggests that LBP can neutralize LPS by an interaction with
serum lipoproteins or through the internalization of an LBP/LPS/CD14 complex by neutrophils
(Wurfel et al., J. Exp. Med. 180:1025-1035 (1994); Wurfel et al., J. Exp. Med. 181:1743-54 (1995);
Gegner et al., J. Biol. Chem. 270:5320-5325 (1995)).

The subject invention provides the polypeptide of SEQ ID NO: 271 and polynucleotide
sequences encoding the amino acid sequence of SEQ ID NO: 271. In one embodiment, the
polypeptides of SEQ ID NO: 271 are interchanged with the polypeptides encoded by the human
cDNA of clone 181-20-3-0-B5-CS. Also included in the invention are biologically active fragments
of the protein of SEQ ID NO: 271 and polynucleotide sequences encoding these biologically active
fragments. In a preferred embodiment, biologically active fragments of SEQ ID NO: 271 are
encoded by clone 181-20-3-0-B5-CS and comprise the first 181 amino acids encoded by clone
181-20-3-0-B5-CS. "Biologically active fragments" are defined as those peptide or polypeptide
fragments of SEQ ID NO: 271 which have at least one of the biological functions of the full length
protein (e.g., the ability to bind bacterial LPS).

The invention also provides variants of SEQ ID NO: 271. These variants have at least
about 80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO: 271. Variants according to the
subject invention also have at least one functional or structural characteristic of SEQ ID NO: 271,
such as the biological functions described above. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein can be
practiced utilizing the polypeptide of SEQ ID NO: 271 or variants thereof. Likewise, the methods
of the subject invention can be practiced using biologically active fragments of the protein of SEQ ID NO:
or variants of said biologically active fragments.
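The identity thresholds recited above can be illustrated with a short Python sketch. It assumes two sequences that have already been aligned to equal length with "-" as the gap character, counts identical columns, and sorts the result into the 80%, 90%, and 95% tiers. The function names, the tier labels, and the toy peptides are hypothetical; the calculation used here (matches divided by aligned columns) is only one of several conventions for percent identity, and the sketch ignores the latitude implied by the word "about".

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct):
    """Map a percent identity onto the tiers recited in the text."""
    if pct >= 95.0:
        return "most preferred (at least 95%)"
    if pct >= 90.0:
        return "more preferred (at least 90%)"
    if pct >= 80.0:
        return "variant (at least 80%)"
    return "below the recited thresholds"


# Toy 10-residue peptides differing at one position (hypothetical data).
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"
pct = percent_identity(reference, candidate)
tier = identity_tier(pct)
```

In practice the alignment itself would come from an alignment program, and the identity value can shift with the gap treatment chosen, so the tier boundaries should be read together with the alignment method used.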

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode SEQ ID NO: 271. It is well within the skill of a person trained in the art to create these
alternative DNA sequences which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention.
As used herein, reference to "essentially the same sequence" refers to sequences that have amino
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity.
Fragments retaining one or more characteristic biological activities of SEQ ID NO: are also included
in this definition.

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular 
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic
code. Various codon substitutions, such as the silent changes which produce specific restriction
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or 
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively. 
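The redundancy of the genetic code referred to above can be made concrete with a brief Python sketch. It builds the standard genetic code table, translates a DNA sequence, and counts how many distinct coding sequences specify a given peptide by multiplying the number of synonymous codons at each position. The function names are hypothetical, and the four-residue peptide KKRP is used only as a small worked example; the standard code is assumed, without organism-specific codon tables.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of synonymous codons.
SYNONYMOUS_CODONS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMOUS_CODONS[aa].append(codon)


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(len(SYNONYMOUS_CODONS[aa]) for aa in peptide.upper())


# Two different DNA sequences that encode the same peptide (silent changes).
variant_1 = "AAGAAACGTCCG"
variant_2 = "AAAAAGAGACCC"
n_kkrp = count_encodings("KKRP")  # 2 * 2 * 6 * 4 synonymous-codon choices
```

Both DNA variants translate to KKRP, and the count for this peptide is 96. The number grows multiplicatively with peptide length, which is why a full-length protein has a very large set of alternative coding sequences from which restriction-site or codon-usage variants can be chosen.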

The protein of SEQ ID NO: 271, and variants thereof, can be used to produce antibodies
according to methods well known in the art. The antibodies can be monoclonal or polyclonal.
Antibodies can also be synthesized against fragments of SEQ ID NO: 271, as well as variants
thereof, according to known methods. The subject invention also provides antibodies which
specifically bind to biologically active fragments of SEQ ID NO: 271 or biologically active
fragments of SEQ ID NO: 271 variants.

The subject invention also provides for immunoassays which are used to screen for,
monitor, or diagnose exposure to LPS. In one embodiment, diagnostic assays measure the level of
LBP in patient plasma samples. LBP levels are known to rise in response to exposure to LPS; thus,
the measurement of the level of the protein of SEQ ID NO: 271 can provide an early indication of
Gram-negative infection or of endotoxin exposure.

The subject invention provides methods of treating individuals infected with Gram-negative
bacteria comprising the administration of therapeutically-effective compositions comprising SEQ
ID NO: 271. In one embodiment, the protein lacks the C-terminal portion (or portions of the
C-terminal domain) necessary to transfer LPS to CD14. LPS can be scavenged by the excess
N-terminal fragment and would be unable to induce an inflammatory response (see, e.g., U.S. Patent
No. 5,731,415, hereby incorporated by reference in its entirety).

Another aspect of the subject invention provides methods of prophylaxis. The method
treats individuals by administration of therapeutically-effective amounts of compositions
comprising SEQ ID NO: 271. Instances where this aspect of the invention can be performed
include, but are not limited to, conditions associated with increased translocation of gut bacteria and
endotoxin, particularly prior to surgery. In addition, patients who are at risk for potential
Gram-negative infection, including but not limited to patients undergoing chemotherapy, or patients who are
immunocompromised (for example with AIDS) can benefit from such treatment. Such uses are
described in U.S. Patent No. 5,990,082, hereby incorporated by reference in its entirety.

The N-terminal portion of LBP, which lacks the ability to induce an inflammatory response,
can be fused to other proteins or fragments thereof (such as the bactericidal/permeability-increasing
protein or BPI) which can increase the association of these molecules with LPS and aid in the
clearance of endotoxin from patients who have been exposed to Gram-negative bacteria. Such
preparations can be used to treat and inhibit a number of Gram-negative, Gram-positive,
or fungal infections, as described in the following patents: WO 95/19179 A, WO 95/19180 A, WO
95/19372 A, and WO 96/34873 A, each of which is incorporated by reference in its entirety.

The subject invention also provides methods of removing endotoxin from
recombinantly-produced proteins. In one embodiment, the recombinantly-produced proteins are obtained from
Gram-negative bacteria. In a preferred embodiment, the bacteria are E. coli. In another
embodiment, the protein of SEQ ID NO: 271, biologically active fragments thereof, variants, or
derivatives thereof, are contacted with compositions comprising recombinantly-produced proteins.
The contacting step can take place with SEQ ID NO: 271 immobilized on a substrate or with SEQ
ID NO: 271 present in free solution.

In addition, the protein of SEQ ID NO: 271, biologically active fragments, or derivatives
thereof, can be used in diagnostic assays to measure the level of LPS in patient plasma samples. In
such an assay, serum samples would be bound to a solid matrix, such as a membrane, plastic,
treated plastic, or other supports, and then probed with the protein of SEQ ID NO: 271.
Visualization can be achieved by fusing the protein of SEQ ID NO: to any number of enzymes
followed by treatment with a chromogenic, fluorogenic, or luminescent substrate. Alternatively, the
protein of SEQ ID NO: 271, biologically active fragments, variants, or derivatives thereof, can be
linked to a fluorescent or luminescent protein or compound. The linkage can be chemical or made
by recombinant techniques known to those skilled in the art. In addition, antibodies raised against
the protein of SEQ ID NO: 271, biologically active fragments, variants, or derivatives thereof can
be used to visualize the LPS/protein 271 complexes using immunoassays known to those skilled in
the art.

Protein of SEQ ID NO:266 (internal designation 116-110-2-0-F4-CS)

The protein of SEQ ID NO:266, highly expressed in the testis, is encoded by cDNA of SEQ
ID NO:25 and exhibits homology to the Ly-6 family of GPI-linked cell-surface glycoproteins
composed of one or more copies of a conserved domain of about 100 amino-acid residues
(PS00983; LY6_UPAR).

The protein of SEQ ID NO:266 shows significant structural similarities to mouse Ly-6
antigens, human CD59 and a herpes virus CD59 homolog. The protein of SEQ ID NO:266
displays one copy of the motif of the u-PAR/Ly-6 domain, with all ten extracellular cysteine
residues conserved. The mature protein sequence contains a relatively high proportion of cysteine
residues (10/105), which suggests that numerous disulfide bonds stabilize its tertiary structure.
Furthermore, the 124 amino-acid long protein of SEQ ID NO:266 has a size very similar to that of
many members of the Ly-6 family. In addition, the protein of the invention has a predicted signal
peptide structure (positions from 1 to 19) and a C-terminal hydrophobic fragment (positions from
101 to 121) necessary for GPI-anchoring in a membrane. Thus, the protein of the invention has a
clear evolutionary relationship with the Ly-6/uPAR family, particularly with the Ly-6 subfamily.
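The figures quoted above are mutually consistent, as the following small Python sketch shows: removing a 19-residue signal peptide from a 124-residue precursor leaves 105 residues, and 10 cysteines in 105 residues is roughly 9.5% of the mature sequence. The helper names are hypothetical, and the sketch assumes the mature protein is simply the precursor minus the signal peptide, ignoring any C-terminal processing associated with GPI anchoring.

```python
def mature_length(precursor_length, signal_peptide_end):
    """Length after removing an N-terminal signal peptide ending at the given position."""
    return precursor_length - signal_peptide_end


def cysteine_fraction(sequence):
    """Fraction of residues in a one-letter protein sequence that are cysteine."""
    sequence = sequence.upper()
    return sequence.count("C") / len(sequence)


# Values taken from the text: 124-residue protein, signal peptide at 1-19,
# ten cysteines in the mature sequence.
mature_residues = mature_length(124, 19)
cysteine_percent = round(100.0 * 10 / mature_residues, 1)
```

Applied to an actual mature sequence, `cysteine_fraction` would give the same proportion directly from the residues rather than from the quoted counts.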

The Ly-6/uPAR protein family members share one or several repeat units of the Ly-6/uPAR
domain, which is defined by a distinct disulfide bonding pattern between 8 or 10 cysteine residues.
This family can be divided into two subfamilies. One comprises GPI-anchored glycoprotein
receptors with 10 cysteine residues. Another subfamily includes the secreted single-domain snake
and frog cytotoxins, and differs significantly in that its members generally possess only eight
cysteines and no GPI-anchoring signal sequence (Andermarai K, et al. Protein Sci 8(4):810-819
(1999)). The Ly-6 family members are low molecular weight phosphatidyl inositol anchored
glycoproteins with remarkable amino acid homology throughout a distinctive cysteine rich protein
domain that is associated predominantly with O-linked carbohydrate. Their GPI links are necessary
to anchor these cell surface proteins to the outside of the lipid bilayer membrane. The Ly-6 family
includes human CD59, which protects from complement-mediated membrane damage, squid Sgp1
and Sgp2, urokinase plasminogen activator receptor, murine Sca-1 and Sca-2, and many other
proteins. The general structure seen within the Ly-6 family resembles that of the receptor for a
urokinase-type plasminogen activator and the alpha-neurotoxins from snake venoms (Fleming T J
et al. J Immunol 150:5379-5390 (1993); Ploug M and V Ellis FEBS Lett 349:163-168 (1994)).
5 The Ly-6 cell surface proteins are differentially expressed in several hematopoietic lineages 

that appear to function in signal transduction and cell activation predominantly on lymphoid cells in 
the mouse. Analyses using anti-Ly-6A/E monoclonal antibodies has also demonstrated in situ 
expression of Ly-6 molecules in brain tissue (staining primary associated with vascular elements 
throughout the brain). These proteins do not appear to be expressed during embryonic or neonatal 

10 stages of development (Cray C et al. Brain Res Mol Brain Res 8(1):9-1 5 (1990)). 

Ly-6 protein expression has been shown to be factor-dependent. For example, the
expression of Ly-6A/E, which normally occurs in hemopoietic stem cells, fibroblasts, and T and
B lymphocytes, has been shown to be greatly induced by IFN in various tissues and cell lines. In
addition, the Ly-6E Ag is associated with tyrosine kinases in T cells, and reduced expression of
Ly-6E in T cells impairs normal functional responses, as well as tyrosine kinase activity, in these cells.
Further, the IFNs are important in the generation of memory CD8+ T cells, and it has been
demonstrated that the expression of Ly-6C Ag is a strong marker for the memory phenotype
(Mehran M. et al. Journal of Immunology 163: 811-819 (1999)). Like their murine counterparts, a
human homologue of Ly-6 genes, the 9804 gene, is responsive to IFNs. The 9804 gene is also
inducible by retinoic acid during differentiation of acute promyelocytic leukemia cells. Further,
cultured glial and neuronal cells express high levels of Ly-6A/E following incubation with
cytokines, including rIFN-gamma (Cray C et al. Brain Res Mol Brain Res 8(1):9-15 (1990)).
Another member of the Ly-6 family, human protein RoBo-1, shows increased expression in
response to two modulators of bone metabolism, estradiol and intermittent mechanical loading,
suggesting a role in bone homeostasis (Noel LS et al. J Biol Chem, Vol. 273(7): 3878-3883 (1998)).
Such factor-dependence of expression makes Ly-6 proteins either candidates or targets for
alloresponses and autoimmune disease. For example, the high level factor-induced expression of
LY-6s has been associated with lupus nephritis (Blake P G et al. J Am Soc Nephrol 4:1140-1150
(1993)).

Murine Ly-6 molecules have interesting patterns of tissue expression during
haematopoiesis, from multipotential stem cells to lineage committed precursor cells, and on specific
leukocyte subpopulations in the peripheral lymphoid tissues. These patterns suggest an intimate
association between the regulation of Ly-6 expression, and the development and homeostasis of the
immune system (Gumley TP et al. Immunol Cell Biol 73(4):277-296 (1995)). Ly-6M messenger
RNA (mRNA) is easily detectable in hematopoietic tissue (bone marrow, spleen, thymus, peritoneal
macrophages) as well as kidney and lung (Patterson JM et al. Blood 95(10):3125-3132 (2000)).

Normally, human blood cells are protected against autologous complement activation by
membrane proteins that block the assembly of functional complement pores. One such protein is
human Ly-6 CD59. Administration of CD59 prevents hemolytic disease or thrombosis. Further,
the CD59 protein may prevent the complement-mediated lysis and activation of endothelial cells
that leads to hyperacute rejection, and therefore may be administered during xenogeneic organ
transplantation (Binette, J. P. and Binette, M. B., Scanning Microsc., 7:1107-10 (1993)).

The surface receptor for urokinase plasminogen activator (uPAR) has been recognized in
recent years as a key molecule in regulating plasminogen mediated extracellular proteolysis.
Surface plasminogen activation controls the connections between cells, basement membrane and
extracellular matrix, and therefore the capacity of cells to migrate and invade neighboring tissues
(Roldan AL et al. EMBO J 9(2):467-474 (1990)). Certain factors of the PA system, such as u-PAR,
have been detected in organs of the male reproductive tract in various species. Morphological
studies provide support for the involvement of the PA system in human male reproductive physiology
(Gunnarsson M et al. Mol Hum Reprod 5(10):934-940 (1999)).

LY-6 proteins have been suggested to play important roles in disorders such as cancers,
nephropathies, autoimmune diseases, hemolytic disease, thrombosis, Alzheimer's disease, etc.
Several members of the murine Ly-6 supergene family are clearly involved in the progression of
certain mouse tumors, as their expression level is higher in highly malignant cells than in tumor
cells with a lower malignancy phenotype. Sorting by flow cytometry of tumor cells into
subpopulations expressing either high or low levels of Ly-6E.1 yielded cells expressing a high or a
low malignancy phenotype, respectively. Further, it was shown that LY-6 is highly expressed on
non-lymphoid tumor cells originating from a variety of tissues in mice. Upregulation or high
expression is correlated with a more malignant phenotype which results in higher efficiency of local
tumor production (Katz et al Int J Cancer 59:684-91 (1994)).

Cells derived from angiogenic tumors express a higher tumorigenicity phenotype and a
higher capacity to produce artificial pulmonary metastases than cells from the poorly angiogenic
tumors. These cells also express significantly higher levels of the lymphocyte activation protein
Ly-6E, so the angiogenic phenotype appears to be coregulated with Ly-6 (Sagi-Assif O et al.
Immunol Lett 54(2-3):207-13 (1996)). Some LY-6 proteins also block secretion of interleukin 2
(IL-2), which is an approved anticancer agent and a key regulatory hormone in cell-mediated
immunity (Fleming T J and T R Malek J Immunol 153: 1955-62 (1994)). IL-2 stimulates the
proliferation of both T and natural killer cells and activates NK cells which can directly lyse freshly
isolated, solid tumor cells.

The high malignancy, high Ly-6E.1-expressing cells also expressed high levels of the
receptor for urokinase plasminogen activator (uPAR), whereas low malignancy, low
Ly-6E.1-expressing cells also expressed low levels of uPAR. Transfection studies have indicated that uPAR
is causally involved in conferring a high malignancy phenotype upon tumor cells expressing high



WO 01/42451 PCT/IB00/01938 
levels of Ly-6E.1. E48, a human homologue of the murine ThB Ly-6 protein, is expressed on head
and neck squamous carcinoma cells. In E48-stimulated cells, the binding of E48 to its
microenvironmental ligand appears to transduce a signal that up-regulates the expression of the FX
enzyme in these cells, leading to an increase in the levels of GDP-L-fucose (Rinat Eshel et al. J
Biol Chem, Vol. 275(17): 12833-12840 (2000)). A congenital disorder of leukocyte adhesion to
vascular endothelium termed LAD II is reflected in a generalized fucose deficiency and major
defects in leukocyte trafficking and function. Ly-6 loss-variants of a murine tumor exhibit
alterations in the incorporation of fucose and mannose into cellular glycoconjugates (Witz IP J.
Cell. Biochem. Suppl. 34:61-66 (2000)).

It is believed that the protein of SEQ ID NO:266 is a novel member of the Ly-6 protein
family, and is thus a specific cell-surface glycoprotein antigen involved in signal transduction and
cell activation, proliferation and differentiation. Preferred polypeptides of the invention are
polypeptides comprising the amino acids of SEQ ID NO:266 from position 1 to position 18 and
from position 19 to position 124. Other preferred polypeptides of the invention are any fragments
of SEQ ID NO:266 having any of the biological activities described herein.

In one embodiment, this invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably testis. For
example, the protein of the invention or part thereof may be used to synthesize specific antibodies
using any technique known to those skilled in the art. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, such as forensic samples, differentiated tumor tissue that has
metastasized to foreign bodily sites, etc., or to differentiate different tissue types in a tissue
cross-section using immunochemistry.

Another embodiment of the present invention relates to methods of using the protein of
the invention or part thereof and related compounds and derivatives to diagnose developmental and
malignant disorders in tissues including urogenital tissues and other tissues of the reproductive
system of both sexes. For example, a biological sample is obtained from a patient with cancer or at
risk of developing cancer, and the level of SEQ ID NO:25 polynucleotides or encoded polypeptides
is detected within the cells of the sample. The detection of an elevated level of the SEQ ID NO:25
polynucleotides or encoded polypeptides in the sample relative to a control level indicates the
presence of malignant cells within the patient. The expression of the protein of the invention can be
investigated using any of a number of methods, including, but not limited to, Northern blotting,
RT-PCR or immunoblotting.

Another embodiment of the invention relates to compositions and methods using the protein
of the invention or part thereof in recombinant protein form as pharmacological agents in the
treatment of developmental and malignant disorders in tissues including urogenital tissues and in
other tissues of the human reproductive system. Particularly, the protein of the invention or part thereof
can be used in the treatment of disorders which are manifested by male sterility.

In another embodiment of the invention, antibodies which bind to the protein of the
invention or part thereof are used in the treatment of tumors, e.g., human urogenital tumors,
especially to enhance the secretion of interleukin 2, which is an approved anticancer agent and key
regulatory hormone in cell-mediated immunity. Such antibodies can be used alone or bound to a
substance capable of ablating or killing cells as a therapy for urogenital disorders or cancers in
which the protein of the invention is overexpressed.

The protein of the invention or part thereof may also be used in the treatment of diseases
which can require transplantation, including various forms of cancers such as genitourinary cancers,
carcinomas, sarcomas, atherosclerosis, angiogenesis, and benign tumors. As mentioned above, the
Ly-6 family includes several proteins which are similar to the protein of the invention and which are
capable of protecting cells from complement-mediated membrane damage. Therefore, in another
embodiment of the invention, recombinant proteins encoded by SEQ ID NO:25 or fragments
thereof are administered during xenogeneic tissue transplantation to prevent complement-mediated
lysis and to block activation of endothelial cells, which normally leads to hyperacute rejection.

In addition, prevention of complement-mediated lysis may be particularly important in
human and animal reproductive therapy, where functional survival of the germ cells during in vitro
handling is crucial. Storage of sperm is of widespread importance in commercial animal breeding
programs, human sperm donor programs, and in the treatment of certain disease states. For
example, sperm samples may be frozen for men who have been diagnosed with cancer or other
diseases that may eventually interfere with sperm production, as well as for assisted reproduction
purposes where sperm may be stored for use at other locations or times. The procedures utilized in
such cases include: washing a sperm sample to separate out the sperm-rich fraction from non-sperm
components of a sample such as seminal plasma or debris; further isolating the healthy, motile
sperm from dead sperm or from white blood cells in an ejaculate; freezing or refrigerating of sperm
for use at a later date or for shipping to females at differing locations; extending or diluting sperm
for culture in diagnostic testing or for use in therapeutic interventions such as in vitro fertilization or
intracytoplasmic sperm injection (Cohen et al. 12:994-1001 (1997)). Once sperm have been
washed or isolated, they are then extended (or diluted) in culture or holding media for a variety of
uses (sperm analysis, diagnostic tests, assisted reproduction). Each of these uses for extended or
diluted sperm requires a somewhat different formulation of basal medium (see, for review, US
Patent No. 6,140,121 Ellington et al. Oct. 2000); however, in all cases sperm survival is suboptimal
outside of the female reproductive tract. Novel additional components of a dilution or storage
medium which could improve the functional preservation of sperm would be useful. Therefore, in
another preferred embodiment of this invention, purified recombinant proteins encoded by SEQ ID
NO:25 or fragments thereof can be added as components of pharmacological media designed to
protect spermatozoa. The methods used to compose such preservation media are generally known
by those skilled in the art (e.g., Oliver S.A. et al. US patent 5,897,987 Apr. 1999; Cohen J. et al.,
supra). Conversely, in yet another embodiment of this invention, ligands, inhibitors, neutralizing
antibodies or other biological agents which recognize, bind and block the protein of the invention
can be used as components of pharmacological formulations designed for male
contraception purposes.

In still another embodiment of this invention, chimeric ligands or derivatives which
recognize the protein of the invention or part thereof and which could be internalized into cells can
be used to design a system of drug delivery finely targeted toward urogenital and other tissues
which express the protein of SEQ ID NO:266. For example, such recognizing molecules can be
incorporated into the membranes of liposomes to allow the specific delivery of the liposomes to
cells expressing the protein of SEQ ID NO:266. Methods of designing such drug delivery systems
are known by those skilled in the art (Smith HJ. Introduction to the principles of drug design and
action, 3rd ed. (1998)).

Proteins of SEQ ID NOs:417, 413, 418 (internal designations 188-45-1-0-D3-CS, 188-26-4-0-F5-CS,
and 188-5-1-0-H6-CS)

The proteins of SEQ ID NOs:417, 413, and 418, encoded by the cDNAs of SEQ ID NOs:176,
172, and 177, are expressed in the brain and exhibit strong homology with proteins with redox
activity (see, e.g. Genbank accession numbers AK001293 and AF029689, and Geneseqp accession
number: Y59180).

The protein of SEQ ID NO:418 (320 amino acids) is a variant of AK001293 (322 amino
acids). Relative to SEQ ID NO:418, AK001293 has six extra nucleotides within the same ORF,
producing a longer protein. SEQ ID NO:418 exhibits the Pfam zinc-binding dehydrogenase (adh_zinc)
signature from positions 16 to 313. SEQ ID NO:418 presents all the conserved residues of the
motif except for a histidine that is thought to be a zinc-ligand. This lack of zinc-ligand residues is a
feature of the quinone oxidoreductases (QOR), a subfamily of zinc-binding dehydrogenases.

SEQ ID NO:413 (191 amino acids) shares the first 172 amino acids with SEQ ID NO:418.
The deletion of one nucleotide at position 583 in the SEQ ID NO:413 cDNA sequence
(corresponding to amino acid 173), however, creates a change of ORF compared to SEQ ID
NO:418 and AK001293.

SEQ ID NO:417 is a short protein (20 amino acids) whose sequence corresponds to the
N-terminal end of the other proteins of the invention. The presence of a T (instead of a G in public
sequences and SEQ ID NOs:413 and 418) at position 128 on the cDNA creates a STOP codon,
creating a shorter protein.
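
The three kinds of sequence difference described above (an in-frame six-nucleotide insertion, a single-nucleotide deletion that shifts the ORF, and a G-to-T substitution that creates a STOP codon) can be illustrated on a toy sequence. The sketch below is only an illustration: the 21-nucleotide reference ORF and the mutation positions are invented for the example and are not taken from SEQ ID NOs:172, 176 or 177; only the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(cdna):
    """Translate from the first base until a STOP codon or the end of the sequence."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        amino_acid = CODON_TABLE[cdna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical reference ORF: ATG GCT GAA GGT CTG AAA TGA
reference = "ATGGCTGAAGGTCTGAAATGA"

# Six extra nucleotides inserted in frame: two extra residues, rest unchanged.
in_frame_insertion = reference[:6] + "GCTGCT" + reference[6:]

# One nucleotide deleted: the reading frame shifts downstream of the deletion.
frameshift = reference[:8] + reference[9:]

# G replaced by T in the first position of a GAA codon: premature STOP (TAA).
premature_stop = reference[:6] + "T" + reference[7:]

for label, seq in [("reference", reference),
                   ("in-frame insertion", in_frame_insertion),
                   ("frameshift", frameshift),
                   ("premature stop", premature_stop)]:
    print(f"{label:20s} {translate(seq)}")
```

In the toy example the insertion lengthens the protein by exactly two residues, the deletion leaves the N-terminal residues intact but changes everything downstream, and the substitution truncates the protein, mirroring the relationships described between AK001293 and SEQ ID NOs:418, 413 and 417.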

SEQ ID NOs:417, 413 and 418 are similar to the QORs, a family of zinc-binding
dehydrogenases. QORs are cytoplasmic redox-regulated flavoenzymes that catalyze the one- or
two-electron reduction of quinones. QORs bind NADP and are inhibited by dicoumarol.

The activity of QORs protects cells against toxicity, mutagenicity, and cancer due to
exposure to environmental and synthetic quinones and their precursors. Thus, QORs play a central
role in monitoring cellular redox state and act to protect against oxidative stress induced by a
variety of metabolic situations (Raina A.K. et al. (1999) Redox Rep. 4:23-7). The oxidoreductase
activity also permits the activation of bioreductive anticancer drugs (Begleiter A. et al. (1996) Br. J.
Cancer Suppl. 27:S9-14).

The metabolism of quinones involves enzymatic reduction of the quinone by one or two 
electrons. In the activation of quinone-containing antitumor agents, this reduction results in the 
formation of the semiquinone or the hydroquinone of the anticancer drug. The consequence of
these enzymatic reductions is that the semiquinone yields its extra electron to oxygen with the
formation of superoxide radical anion and the original quinone. This reduction by a reductase 
followed by oxidation by molecular oxygen (dioxygen) is known as redox-cycling and continues 
until the system becomes anaerobic. In the case of a two-electron reduction, the hydroquinone 
could become stable, and as such, be excreted by the organism in a detoxification pathway. 
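
The reduction steps described in this paragraph can be summarized schematically. The following is a simplified sketch, in which Q denotes a generic quinone; protonation states are shown only for the two-electron product, and the enzyme and its electron donor are omitted.

```latex
\begin{align*}
\text{one-electron reduction:}\quad
  & \mathrm{Q} + e^{-} \longrightarrow \mathrm{Q}^{\bullet -}
  && \text{(semiquinone)} \\
\text{reoxidation by dioxygen:}\quad
  & \mathrm{Q}^{\bullet -} + \mathrm{O_2} \longrightarrow \mathrm{Q} + \mathrm{O_2}^{\bullet -}
  && \text{(superoxide radical anion)} \\
\text{two-electron reduction:}\quad
  & \mathrm{Q} + 2e^{-} + 2\mathrm{H}^{+} \longrightarrow \mathrm{QH_2}
  && \text{(hydroquinone)}
\end{align*}
```

The first two steps regenerate the original quinone, which is why they can repeat as a redox cycle for as long as oxygen is available, whereas the two-electron product can leave the cycle.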

The cellular antioxidant response is mediated by a battery of detoxifying/defensive proteins.
The promoters of genes that encode these proteins contain a common cis-element termed the
antioxidant response element (ARE). Many transcription factors, including Nrf, Jun, Fos, Fra, Maf,
YABP, ARE-BP1, Ah (aromatic hydrocarbon) receptor, and estrogen receptor bind to the ARE
from various genes. Among these factors, Nrf-Jun heterodimers positively regulate ARE-mediated
expression and induction of genes in response to antioxidants and xenobiotics (reviewed in
Dhakshinamoorthy S. et al. (2000) Curr. Top Cell Regul. 36:201-16). On the other hand, c-Fos
represses ARE-mediated gene expression (Venugopal, R., and Jaiswal, A.K. (1996) Proc. Natl.
Acad. Sci. USA 93, 14960-5).

Elevated levels of QOR activity have been reported in several kinds of tumors such as liver,
colon, lung and breast (Belinsky M., Jaiswal A.K., (1993) Cancer Metastasis Rev 12:103-17).
Bioreductive antitumor agents are an important class of anticancer drugs that require activation by
reduction. For this reason, QORs are a potential target on which to base the development of new
antitumor compounds. Certain QORs have already been implicated in the metabolism, activation
and mechanism of cytotoxicity of some anticancer drugs such as mitomycin C, indoloquinone EO9
(Ross D. et al. (1994) Oncol. Res. 6:493-500), CB 1954 (Knox R.J. et al. (2000) Cancer Res.
60:4179-86) or antiestrogens in breast cancer (Montano M.M., Katzenellenbogen B.S. (1997)
PNAS 94:2581-6).

In addition, some of the proteins of the QOR family are thought to play a role in the
prevention of apoptosis following oxidative stress. The tumor suppressor gene p53 has been
directly implicated in the induction of apoptosis in dividing cells and in hippocampal pyramidal
neurons (Jordan J. et al. (1997) J. Neurosci 17:1397-405) and a QOR gene has been described as a
p53-regulated gene (Kostic C, Shaw P.H. (2000) Oncogene 19:3978-87).

It is believed that the proteins of SEQ ID NOs:417, 413 and 418 have a redox activity, most
likely as QORs. Thus, they are expected to act as endogenous antioxidants against oxidative
stress and may be able to use NADP as cofactor. The proteins of the invention may be used to
deactivate toxins and to activate bioreductive anticancer drugs. In addition, they may prevent
apoptosis following oxidative stress and be regulated by p53. Because the proteins of SEQ ID NOs:417
and 413 do not contain the Pfam zinc-binding dehydrogenase (adh_zinc) signature, in contrast to
SEQ ID NO:418, they may act as competitive inhibitors, i.e. dominant negative forms, of the
functional protein.

The oxidoreductase activity of the proteins of the invention may be assayed using any
technique known to those skilled in the art. For example, the measurement of the rate of oxidation
of NADPH and oxygen consumption, and the detection of the semiquinone and reactive oxygen
species, may be performed as described by Gutierrez P.L. (Gutierrez P.L. (2000) Front. Biosci.
5:D629-38), or by any other method known to those skilled in the art. The enzymatic activity of the
proteins of the invention in different affected and control tissues may be assayed by histochemical staining. To
confirm the role of the proteins of the invention in the cellular antioxidant response, in vitro and in
vivo assays may be performed. Transcription levels of the genes coding for the proteins of the
invention may be measured using standard techniques after exposure to quinones or derived
compounds such as beta-naphthoflavone (beta-NF), as described by Belinsky M. and Jaiswal A.K. (supra),
as well as in response to transcription factors such as Nrf, Jun and c-Fos, or in the presence of p53.
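
One common way to follow the rate of NADPH oxidation is spectrophotometrically, from the decrease in absorbance at 340 nm. The sketch below assumes such a readout, which the text above does not specify; it uses the Beer-Lambert law with the widely used molar absorptivity of NADPH at 340 nm (6220 M^-1 cm^-1). The function names, cuvette path length and sample figures are hypothetical.

```python
# Molar absorptivity of NADPH at 340 nm, in M^-1 cm^-1.
NADPH_EXTINCTION_340 = 6220.0


def nadph_oxidation_rate(delta_a340_per_min, path_length_cm=1.0):
    """Return the NADPH oxidation rate in micromolar per minute.

    delta_a340_per_min is the slope of absorbance versus time; it is
    negative while NADPH is consumed, so the absolute value is used.
    """
    molar_per_min = abs(delta_a340_per_min) / (NADPH_EXTINCTION_340 * path_length_cm)
    return molar_per_min * 1e6


def specific_activity(delta_a340_per_min, assay_volume_ml, protein_mg,
                      path_length_cm=1.0):
    """Return nmol NADPH oxidized per minute per mg of protein."""
    rate_micromolar = nadph_oxidation_rate(delta_a340_per_min, path_length_cm)
    # micromolar x millilitres = nanomoles
    nmol_per_min = rate_micromolar * assay_volume_ml
    return nmol_per_min / protein_mg


# Hypothetical reading: absorbance falls by 0.0622 per minute in a 1 ml assay
# containing 0.05 mg of protein.
print(nadph_oxidation_rate(-0.0622))
print(specific_activity(-0.0622, assay_volume_ml=1.0, protein_mg=0.05))
```

Comparing the specific activity obtained with and without an inhibitor such as dicoumarol would be one way to attribute the measured rate to a QOR-type activity.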

In one embodiment of the present invention, the present protein can be used to detect
specific cell types in vitro or in vivo. For example, as the present proteins are overexpressed in the
brain, reagents capable of specifically recognizing the present protein can be used as markers for
brain cells. Brain-specific markers have a number of uses, including for the identification of
specific tissues for histological analyses, as well as to detect the origin of tumor cells. In addition,
as the expression of the present protein is likely induced by transcription factors such as Nrf, Jun,
Fos, Fra, Maf, YABP, ARE-BP1, Ah (aromatic hydrocarbon) receptor, and estrogen receptor, as
well as by p53, reagents specific for detecting the present protein can also be used as a marker for
the activity of any of these proteins in vitro or in vivo. In view of the association between many of
these proteins and diseases such as cancer, the ability to detect the presence or absence of the
proteins provides powerful tools for disease diagnosis and screening. For any of these applications,
the expression of the present protein can be detected using any standard method, including Northern
blots, western blots, in situ hybridization, PCR, etc.

In another embodiment, the proteins of the invention can serve as markers for cellular
oxidative stress in vivo and in vitro. As such, the proteins of the invention or part thereof may be
useful in the diagnosis of disorders in which oxidative stress is implicated, including a large variety
of types of cancer as well as neurodegenerative disorders such as Alzheimer's disease (AD),
amyotrophic lateral sclerosis (ALS) or Parkinson's disease (PD). For diagnostic purposes, the
expression of the protein of the invention may be investigated using, e.g., Northern blotting,
RT-PCR or immunoblotting methods and compared to the expression in control individuals. An
increased level of the proteins of the invention in patients compared with controls indicates a major
shift in redox balance and, thus, indicates the presence of the disease or of a susceptibility for the
disease.

The invention further relates to methods and compositions using the proteins of the
invention or part thereof to prevent and/or treat disorders in which oxidative stress is implicated,
including those mentioned above. For these purposes the proteins themselves, or polynucleotides
encoding the proteins, or an activator of protein expression may be administered to patients, or to
disease-free individuals in case of increased susceptibility to one of these disorders.

In another embodiment, the protein of the invention or part thereof is used to prevent cells
from undergoing apoptosis. It may thus be useful in the diagnosis, treatment and/or prevention
of disorders and processes in which apoptosis is deleterious, including but not limited to immune
deficiency syndromes (including AIDS), type I diabetes, pathogenic infections, cardiovascular and
neurological injury, alopecia, aging, degenerative diseases including AD and PD, dystonia, Leber's
hereditary optic neuropathy and schizophrenia. For all such diagnostic purposes, the expression of
the proteins of the invention can be investigated using any of the Northern blotting, RT-PCR or
immunoblotting methods described herein and compared to the expression in control individuals.

The invention relates to methods and compositions using the proteins of the invention or
part thereof as detoxifying enzymes against quinones. There are a variety of quinones with a toxic
effect in cells (e.g. quinones derived from the oxidation of phenolic metabolites of benzene,
DA-quinones, or menadione). Thus, the proteins of the invention or part thereof may be protective
against the hematotoxic and carcinogenic effects of benzene, as well as against benzene-caused
diseases such as cancer, aplastic anemia and pancytopenia. Moreover, they may detoxify
DA-quinones in the brain, thereby providing neuroprotection in Parkinson's disease. In still another
embodiment, the proteins of the invention or part thereof may protect cells against
menadione-induced oxidative stress, with known effects on myocardial cells (Floreani M. et al (2000) Biochem
Pharmacol. 60:601-5). For prevention and/or treatment purposes the proteins themselves, or
polynucleotides encoding the proteins, or an activator of protein expression may be administered.

In another embodiment, the present proteins may be a target of chemotherapy specific to
different kinds of cancer, to ensure a favorable response to anticancer drugs. Specifically, proteins
of the invention or part thereof may be used as an activator of cytotoxic prodrugs of the quinone family.
Accordingly, the protein of the invention or part thereof may be administered to a patient in
conjunction with a bioreductive anticancer agent in order to activate the drug. This
co-administration may be by simultaneous administration, such as a mixture of the oxidoreductase and
the drug, or by separate simultaneous or sequential administration. Cancer-specific antitumor
agents based on QOR substrates may be designed as described by Xing J. et al. (Xing J. et al. (2000)
Med. Chem. 43:457-66) and assayed as described in Li B. et al. (Li B et al. (1999) Chem. Res.
Toxicol. 12:1042-9). Alternatively, as the present proteins may be overexpressed in tumor cells,
such methods may be performed by simply detecting the level of the present protein in tumor cells,
and administering the prodrug specifically to those patients found to have elevated levels of the
protein in their tumor cells.

Proteins of SEQ ID NOs:415, 310, 317 (internal designations 188-29-2-0-H1-CS, 188-18-4-0-A9-CS,
188-9-2-0-E1-CS)

Mammalian inositol hexakisphosphate kinase 2 (IP6K2), an enzyme of the inositol
phosphate pathway, has been cloned and described by two independent groups [Saiardi, A.;
Erdument-Bromage, H.; Snowman, A. M.; Tempst, P.; and Snyder, S. H., (1999) Current Biology 9,
1323-1326, and Katai, K.; Miyamoto, K-I.; Kishida, S.; Segawa, H.; Nii, T.; Tanaka, H.; Tani, Y.;
Arai, H.; Tatsumi, S.; Morita, K.; Taketani, Y.; and Takeda, E. (1999) Biochem. J. 343, 705-712].
Newly identified consensus sequences of inositol-polyphosphate kinases are represented by
[LV]-[LA]-[DE]-X(3-8)-P-X-[VAI]-[ML]-D-X-K-[ML]-G [Saiardi, A.; Erdument-Bromage, H.;
Snowman, A. M.; Tempst, P.; and Snyder, S. H. (1999) Current Biology 9, 1323-1326]. IP6K2
catalyzes the transfer of phosphate groups from InsP6 or Ins(1,3,4,5,6)P5 (the substrate) to another
protein or small molecule, such as a nucleoside di-phosphate.
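
The consensus sequence quoted above is written in a PROSITE-like notation and can be converted mechanically into a regular expression in order to scan a candidate protein sequence. The following is a minimal sketch; the function name and the two test peptides are invented for illustration and are not taken from SEQ ID NOs:415, 310 or 317.

```python
import re

# [LV]-[LA]-[DE]-X(3-8)-P-X-[VAI]-[ML]-D-X-K-[ML]-G rewritten as a regex:
# bracketed sets stay as character classes, X becomes ".", and X(3-8)
# becomes ".{3,8}".
IPK_CONSENSUS = re.compile(r"[LV][LA][DE].{3,8}P.[VAI][ML]D.K[ML]G")


def find_ipk_motifs(protein):
    """Return (1-based start position, matched segment) for each motif hit."""
    return [(match.start() + 1, match.group())
            for match in IPK_CONSENSUS.finditer(protein)]


# Hypothetical peptides: the first contains the motif, the second has the
# conserved lysine replaced by arginine and should not match.
with_motif = "MSTLLDAAAAPQVMDAKMGSS"
without_motif = "MSTLLDAAAAPQVMDARMGSS"

print(find_ipk_motifs(with_motif))
print(find_ipk_motifs(without_motif))
```

A hit of this kind only indicates that a sequence is compatible with the consensus; it does not by itself demonstrate kinase activity.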

The subject invention provides the polypeptides of SEQ ID NOs:415, 310, and 317,
encoded by the cDNAs of SEQ ID NOs:174, 69, and 76, respectively. The invention also provides
biologically active fragments of SEQ ID NOs:415, 310, and 317. In one embodiment, the
polypeptides of SEQ ID NOs:415, 310, and 317 are interchanged with the corresponding
polypeptides encoded by the human cDNA of clone 188-29-2-0-H1-CS, 188-18-4-0-A9-CS, or
188-9-2-0-E1-CS. "Biologically active fragments" are defined as those peptide or polypeptide
fragments having at least one of the biological functions of the full length protein (e.g., kinase
activity). Compositions of the protein/polypeptide of SEQ ID NOs:415, 310, or 317, or biologically
active fragments thereof, are also provided by the subject invention. These compositions may be
made according to methods well known in the art.

The invention also provides variants of the protein of SEQ ID NOs:415, 310, or 317. These
variants have at least about 80%, more preferably at least about 90%, and most preferably at least
about 95% amino acid sequence identity to the amino acid sequences encoded by SEQ ID NOs:415,
310, and 317. Variants according to the subject invention also have at least one functional or
structural characteristic of the protein of SEQ ID NOs:415, 310, or 317. The invention also
provides biologically active fragments of the variant proteins. Compositions of variants, or
biologically active fragments thereof, are also provided by the subject invention. These
compositions may be made according to methods well known in the art. Unless otherwise
indicated, the methods disclosed herein can be practiced utilizing the protein encoded by SEQ ID
NO:415, 310, or 317, biologically active fragments of SEQ ID NO:415, 310, or 317, variants of
SEQ ID NO:415, 310, or 317, and biologically active fragments of the variants.
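
The identity thresholds recited above can be illustrated with a simple calculation on two sequences that have already been aligned. The sketch below is one possible convention, not a definition from the text: it assumes gaps are written as "-", counts identical residues over all alignment columns, and treats the "about 80/90/95%" figures as exact cut-offs. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(identity):
    """Map a percent identity onto the three tiers named in the text."""
    if identity >= 95.0:
        return "most preferred (at least about 95%)"
    if identity >= 90.0:
        return "more preferred (at least about 90%)"
    if identity >= 80.0:
        return "variant (at least about 80%)"
    return "below the recited thresholds"


# Hypothetical ten-residue alignment with a single substitution.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, identity_tier(identity))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages for the same pair, so the convention used should be stated alongside any reported value.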

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode the amino acid sequence of SEQ ID NO:415, 310, or 317. In a preferred embodiment, SEQ
ID NO:415, 310, or 317 is encoded by clone 188-29-2-0-H1-CS, 188-18-4-0-A9-CS, or 188-9-2-0-E1-CS,
or by the cDNAs of SEQ ID NO:174, 69, or 76. It is well within the skill of a person
trained in the art to create these alternative DNA sequences which encode proteins having the same,
or essentially the same, amino acid sequence. These variant DNA sequences are, thus, within the
scope of the subject invention. As used herein, reference to "essentially the same" sequence refers
to sequences that have amino acid substitutions, deletions, additions, or insertions that do not
materially affect biological activity. Fragments retaining one or more characteristic biological
activities of the protein encoded by SEQ ID NO:415, 310, or 317 are also included in this definition.
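
To give a sense of how many alternative DNA sequences the redundancy of the genetic code allows, the following minimal sketch multiplies the number of synonymous codons for each residue of a peptide. It assumes the standard genetic code; the peptide used in the example is hypothetical and is not taken from the sequence listing.

```python
from math import prod

# Number of sense codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Count the distinct coding sequences (stop codon excluded) for a peptide."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Hypothetical tripeptide: Met (1 codon) x Lys (2 codons) x Leu (6 codons).
    print(count_encodings("MKL"))
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why the text speaks of "a variety of different DNA sequences" rather than enumerating them.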

In one aspect of the subject invention, SEQ ID NO:415, 310, or 317, and variants thereof,
can be used to generate polyclonal or monoclonal antibodies. Both biologically active and
immunogenic fragments of SEQ ID NO:415, 310, or 317, or variant proteins, can be used to
produce antibodies. Polyclonal and/or monoclonal antibodies can be made according to methods
well known to the skilled artisan. Antibodies produced in accordance with the subject invention can
be used in a variety of detection assays known to those skilled in the art. The antibodies may be
used to agonize or antagonize the biological activity of the protein of SEQ ID NO:415, 310, or 317.

The protein of SEQ ID NO:415, 310, or 317 can be used for the synthesis of nucleoside
triphosphate (NTP) compounds. In one embodiment, the NTP compound produced is ATP, GTP,
CTP, or TTP. In this aspect of the subject invention, SEQ ID NO:415, 310, or 317 removes a
phosphate from InsP6 or Ins(1,3,4,5,6)P5 and transfers it to a nucleoside diphosphate (e.g., ADP,
CDP, GDP, or TDP) to create an NTP. The conditions and methods for the synthesis of NTP
compounds, such as ATP, are well known to the skilled artisan. Thus, the protein of SEQ ID
NO:415, 310, or 317 has an industrially useful function for the synthesis of commercially valuable
products.

The subject invention also provides methods of determining the relative amounts of InsP6
or Ins(1,3,4,5,6)P5 in the cell by a kinase assay. In this aspect of the invention, SEQ ID NO:415,
310, or 317 can be used to transfer phosphate groups from InsP6 or Ins(1,3,4,5,6)P5 to acceptor
substrates according to well-known kinase activity assays.

Protein of SEQ ID NO:294 (internal designation 181-16-2-0-A7-CS)

The protein of SEQ ID NO:294 is encoded by the cDNA of SEQ ID NO:53. It will be
appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:294 described
throughout the present application also pertain to the polypeptide encoded by the human cDNA of
clone 181-16-2-0-A7-CS. In addition, it will be appreciated that all characteristics and uses of the
nucleic acid of SEQ ID NO:53 described throughout the present application also pertain to the
human cDNA of clone 181-16-2-0-A7-CS.

This gene was isolated from fetal liver and expression has also been detected in fetal
kidney, placenta, liver, brain, hypertrophic prostate, salivary gland and testis. Data from PCT
application WO 98/23435 indicate expression is primarily in bone marrow cell lines, and to a lesser
extent, in human endometrial stromal cells, human adult small intestine and human pancreas tumor.
PCT application WO 99/14484 reports the fraction of expression in the gastrointestinal system
(0.227), reproductive system (0.193), and hematopoietic/immune system (0.168). Finally, this
protein is 55% identical and 76% similar to CGI-128 protein, which was isolated from CD34+ cells
and is also found in cell lines from the hematopoietic lineage including HL6 (granulocytic), Jurkat
(T-lymphocytic), K562 (erythro-megakaryocytic), and U937 (monocytic).

Supernatant harvested from cells expressing the product of this gene has been shown to
increase the permeability of the plasma membrane of renal mesangial cells to calcium. Thus, it is
believed that the product of this gene is involved in activating a signal transduction pathway when it
binds a receptor on the surface of the plasma membrane of mesangial cells, as well as of
other cell lines or tissue cell types. Thus, polynucleotides and polypeptides have
uses, which include, but are not limited to, activating mesangial cells by contacting said cells with a
full length polypeptide or a polypeptide fragment which demonstrates this biological activity.
Further, the polynucleotides and polypeptides can be used in the methods described in WO9915652,
incorporated in its entirety. Binding of a ligand to a receptor is known to alter intracellular levels of
small molecules, such as calcium, potassium and sodium, as well as alter pH and membrane
potential. Alterations in small molecule concentration can be measured to identify supernatants
that bind to receptors of a particular cell. In addition, when tested against fibroblast cell lines,
supernatants removed from cells containing this gene activated the EGR1 (early growth response
gene 1) promoter element. Thus, it is likely that this gene activates fibroblast cells through the
EGR1 signal transduction pathway. EGR1 is a separate signal transduction pathway from Jak-STAT.
Genes containing the EGR1 promoter are induced in various tissues and cell types upon
activation, leading the cells to undergo differentiation and proliferation (PCT application WO
98/23435).

Polynucleotides comprising sequences encoding the signal peptide of the protein, e.g.
VLWLSGLSEPGAA/RQ, can be used in construction of secretion vectors. These vectors would
then facilitate the secretion of fusion proteins into the media of cells that have been transfected with
the construct of interest. Antibodies which specifically bind the signal peptide could be used to
purify the fusion protein from the media if desired.

Polynucleotides and polypeptides of the invention are useful as reagents for differential
identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of
diseases and conditions which include, but are not limited to, haemopoietic and gastrointestinal tract
disorders and stromatosis, in addition to endothelial, mucosal, or epithelial cell disorders. Similarly,
polypeptides, and antibodies directed to these polypeptides, are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the
above tissues or cells, particularly of the immune and digestive systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell types (e.g.
hematopoietic, immune, reproductive, gastrointestinal, endocrine, and cancerous and wounded
tissues) or bodily fluids (e.g. lymph, serum, plasma, urine, synovial fluid and spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.
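
The comparison against a standard expression level described above can be sketched as a simple ratio test. This is a minimal illustration only: the two-fold cutoff, the function name, and the assumption that both levels are measured on the same linear scale are hypothetical choices, since the text does not define what counts as "significantly higher or lower".

```python
def classify_expression(sample_level: float,
                        standard_level: float,
                        fold_threshold: float = 2.0) -> str:
    """Compare a sample's expression level with the standard (healthy) level.

    fold_threshold is a hypothetical cutoff; the source text does not specify one.
    """
    if standard_level <= 0 or sample_level < 0:
        raise ValueError("levels must be non-negative and the standard must be positive")
    if fold_threshold <= 1.0:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "comparable"


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(classify_expression(10.0, 4.0))
    print(classify_expression(1.0, 4.0))
    print(classify_expression(5.0, 4.0))
```

In practice, a statistical test over replicate measurements would normally replace a fixed fold cutoff.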

The tissue distribution in bone marrow cells, fetal liver and fetal kidney, combined with the
detected calcium flux and EGR1 biological activity, indicates that polynucleotides and polypeptides
corresponding to this gene are useful for immune and gastrointestinal tract disorders, and
stromatosis, particularly tumors and proliferative disorders. More specifically, polynucleotides and
polypeptides corresponding to this gene are useful for the treatment and diagnosis of hematopoietic
related disorders such as anemia, pancytopenia, leukopenia, thrombocytopenia or leukemia, since
stromal cells are important in the production of cells of hematopoietic lineages. The polypeptides
and polynucleotides of the invention can be used to enhance hematopoiesis as described in
WO9831385, incorporated in its entirety. The uses include bone marrow cell ex vivo culture, bone
marrow transplantation, bone marrow reconstitution, radiotherapy or chemotherapy of neoplasia.
The gene product may also be involved in lymphopoiesis; therefore, it can be used in immune
disorders such as infection, inflammation, allergy, immunodeficiency, etc. In addition, this gene
product may have commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell types. The protein,
as well as antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Additionally, since the gene product of 181-16-2-0-A7-CS has been shown to activate the
EGR1 promoter element, it likely activates EGR1 signaling activity in fibroblasts. Recent data
show that activation of EGR1 plays a role in wound repair. The cellular transcription factor early
growth response factor 1 (Egr-1) is expressed minutes after acute injury and serves to stimulate the
production of a class of growth factors whose role is to promote tissue repair. Egr-1 expression at
the site of dermal wounding in rodents promotes angiogenesis in vitro and in vivo, increases
collagen production, and accelerates wound closure. These results show that Egr-1 gene therapy
accelerates the normal healing process (Human Gene Ther 2000, vol 11(15):2143-58). Thus, an
activator of EGR1 signaling, specifically the gene products of 181-16-2-0-A7-CS (polypeptides and
polynucleotides), would be useful in the wound healing process using the methods described in
WO9941282 and WO9932135, incorporated by reference in their entireties.

Protein of SEQ ID NO:305 (internal designation 187-37-0-0-c10-CS)



The protein of SEQ ID NO:305, encoded by the cDNA of SEQ ID NO:64, is highly
expressed in the prostate and brain. The protein of the invention is strongly homologous to the D9
protein, found in both humans (GNP accession number: U95006 and U95007) and in mice (GNP
accession number: U95003, U95004, and U95005). D9 is a myeloid precursor protein transcript
regulated by the retinoic acid receptor α, hereafter referred to as RAR-α (Scott et al. Blood 1996;
88:2517-30).

Retinoic acid is the active metabolite of vitamin A, which contributes to a wide range of
biological processes such as cellular differentiation, embryogenesis, and tumor suppression. More
specifically, retinoic acid stimulates myeloid precursor differentiation into mature granulocytes.
For instance, in vitro treatment of acute promyelocytic leukemia blast cells with retinoic acid
induces their differentiation (Miyauchi et al. Leuk Lymphoma 1999;33:267-80). In addition,
treatment with retinoic acid can induce disease remission in patients affected with promyelocytic
leukemia by causing granulocyte precursor differentiation (Slack et al. Ann Hematol 2000;79:227-38).

The diverse range of responses to retinoic acid is mediated by three receptor subtypes:
RAR-α, RAR-β, and RAR-γ. RAR-α has been identified as being important for bone marrow
maturation of granulocytes (Tsai et al. Genes Dev 1992;6:2258-69). In addition, RAR-α is almost
invariably involved in acute promyelocytic leukemia cells by a reciprocal translocation between the
long arms of chromosomes 15 and 17 (Alcalay et al., Proc Natl Acad Sci USA 1991;88:1977-81).
This type of leukemia is mainly characterized by a predominance of malignant promyelocytes, and
severe hemorrhagic manifestations resulting from activation of the coagulation cascade and the
fibrinolytic system (Tallman et al. Semin Thromb Hemost 1999;25:209-15). Reciprocal
chromosomal translocation leads to the production of a fusion protein that inhibits differentiation
and promotes survival of myeloid precursor cells (Grignani et al. Cell 1993;74:423-431). Transient
transfection of a vector containing RAR-α in a promyelocyte cell line causes early upregulation
of several genes, including D9, which is strongly related to the protein of SEQ ID NO:305
(Scott et al. Blood 1996; 88:2517-30). Thus, it is believed that the protein of SEQ ID NO:305 is a
myeloid-related protein whose expression is induced by the activation of retinoic acid receptors,
including RAR-α.

In a preferred embodiment, the protein of the invention or part thereof may be used to assay
the activity of RAR-α protein or retinoic acid in a biological sample. Specifically, as the expression
of the protein is believed to be under the direct control of retinoic acid receptors, the level of the
protein of the invention, or of the mRNA encoding the protein, can serve as a sensitive and
immediate marker for the effects of retinoic acid upon a cell. An ability to detect retinoic acid
receptor activation in cells using the present protein has numerous uses. For instance, the protein of
the invention or part thereof can be used to monitor the effects of retinoic acid on cells of a patient
undergoing retinoic acid treatment for promyelocytic leukemia (Slack et al. Ann Hematol
2000;79:227-38). As retinoic acid treatment is associated with frequent delayed, dose-dependent
side effects, it is believed that an assay based on the protein of SEQ ID NO:305 could be used to adjust
the dose of retinoic acid administered to patients affected with promyelocytic leukemia, in order to
predict and avoid such adverse side effects (Slack et al. Ann Hematol 2000;79:227-38).

In another embodiment, the present polypeptides and polynucleotides can be used to
identify myeloid precursors, as well as brain and prostate tissues. The ability to specifically
visualize myeloid precursor cells, as well as brain and prostate tissues (and cells derived from the
tissues), is useful for any of a number of applications, including to determine the origin or identity
of, e.g. cancerous cells, as well as to facilitate the identification of particular cells and tissues for,
e.g. the evaluation of histological slides. In addition, such assays can be used to examine the extent
of differentiation in myeloid precursor cells.

The present invention further relates to in vitro assays and diagnostic kits based on the
protein of the present invention or part thereof. Such assays may be used for diagnosis of disorders
where the protein activity is abnormally downregulated, such as cancer, and hematological
disorders including leukemia. As the protein of SEQ ID NO:305, RAR-α, and acute promyelocytic
leukemia are all related, variation in the measured level of the protein of the invention or
part thereof can be used as a diagnostic or screening test for acute promyelocytic leukemia, e.g.
using a biological sample such as serum or plasma. Further, an assay that can detect an abnormal
level of the protein of the invention or part thereof can be used to detect residual disease in acute
promyelocytic leukemia. Such an assay may be used to aid therapeutic decisions in this disorder,
e.g. more or less aggressive treatments, the duration of treatment, etc.

In another embodiment, various methods can be used to modulate activity and/or expression
of the protein of SEQ ID NO:305, e.g. for the treatment, attenuation and/or prevention of various
disorders. In one embodiment, any of a number of reagents, e.g. polynucleotides encoding the
protein of SEQ ID NO:305 or a fragment thereof, the protein of SEQ ID NO:305 itself, or a
compound that increases the expression or activity of the protein of SEQ ID NO:305, can be
administered to a patient suffering from, or at risk of developing, various disorders including
cancer, and hematological diseases such as leukemia and neutropenia. For instance, and without
limitation, proteins or other compounds capable of enhancing the expression or activity of the protein of SEQ ID
NO:305 can be administered to treat patients affected with acute promyelocytic leukemia, in order
to induce differentiation of the affected cells into mature granulocytes (Slack et al. Ann Hematol
2000;79:227-38). In still another preferred embodiment, proteins or other compounds capable of
increasing the expression or activity of the protein of the invention can be used to treat, prevent
and/or attenuate neutropenia or agranulocytosis in patients, in order to induce in vivo differentiation of
myeloid precursors into mature granulocytes. In still another preferred embodiment, proteins or
other compounds capable of increasing the expression or activity of the protein of SEQ ID NO:305
can be used to treat coagulopathic diseases, such as thrombosis or hemorrhagic manifestations. For
instance, they can be used to treat disseminated intravascular coagulation, a severe hemorrhagic
syndrome. This embodiment is supported by the fact that acute promyelocytic leukemia is
frequently associated with disseminated intravascular coagulation (Tallman et al. Semin Thromb
Hemost 1999;25:209-15), and disseminated intravascular coagulation is efficiently corrected with
retinoic acid (Dombret et al. Leukemia 1993;7:2-9).

In addition, modulation of the expression or activity of the protein of the invention can be
used to modulate differentiation of cells, e.g. promyelocytic leukemia cells. In one such embodiment, the
protein of the invention is inhibited, e.g. using antisense molecules, antibodies, or small molecule
inhibitors of the expression or activity of the protein, in order to maintain the undifferentiated state
of cells grown in vitro. Alternatively, agents that increase the expression or activity of the protein
in cells can be used to induce cellular differentiation, e.g. in the preparation of specific cell types in
vitro for particular therapeutic applications.

Protein of SEQ ID NO:248 (internal designation 105-035-2-0-C6-CS) and SEQ ID NO:313
(internal designation 188-28-4-0-D4-CS)

The proteins of SEQ ID NO:248, encoded by the cDNA of SEQ ID NO:7, and SEQ ID NO:313,
encoded by the cDNA of SEQ ID NO:72, are highly expressed in brain, liver, pancreas, and
testis. The proteins of the invention are nuclear proteins (Miller et al. J Biol Chem
2000;275:32052-6) that display a membrane-spanning segment from amino acids 58 to 78. These
proteins are homologous to the human RNA polymerase II elongation factor ELL3 (EMBL,
accession number AF276512; Miller et al. J Biol Chem 2000;275:32052-6). In addition, the
proteins of SEQ ID NO:248 and SEQ ID NO:313 share sequence homology with two other
members of the polymerase II elongation factor family: ELL and ELL2. The protein of SEQ ID
NO:313 is similar to the N-terminal sequence of the protein of SEQ ID NO:248, but differs after
residue 240 because of a frameshift that produces a premature stop in the sequence of SEQ ID NO:72
(Miller et al. J Biol Chem 2000;275:32052-6). Additionally, the alignment of the protein of SEQ
ID NO:248 with occludin, an integral membrane protein found at tight junctions (Furuse et al. J Cell
Biol 1994;127:1617-26), reveals that both proteins display a C-terminal ZO-1 binding domain, with
a 26% homology over a 108 amino acid segment. The protein of SEQ ID NO:313 lacks this domain, as its
C-terminal region is truncated as compared to the protein of SEQ ID NO:248. ZO-1 is part of the
family of membrane-associated guanylate kinase homologs (MAGUKs) believed to be important in
signal transduction originating from sites of cell-cell contact (Willott et al. Proc Natl Acad Sci USA
1993;90:7834-8).

The proteins of SEQ ID NOs:248 and 313 are RNA polymerase II elongation factors that
increase the catalytic rate of transcription elongation, a phase during which RNA polymerase II
moves along the DNA and extends the growing RNA chain (Miller et al. J Biol Chem 2000;
275:32052-6). Specifically, the proteins of SEQ ID NOs:248 and 313 suppress transient pausing at
multiple sites along the DNA, thereby altering the Km and/or the Vmax of the polymerase (Miller et
al. J Biol Chem 2000;275:32052-6). The present proteins belong to a family that is known to
include one virally encoded protein (Tat) and six cellular proteins (SII, P-TEFb, TFIIF, Elongin
(SIII), ELL and ELL2).

A growing body of evidence suggests that mis-regulation of elongation may be a key
element in a variety of human diseases (see, Aso et al. J Clin Invest 1996;97:1561-9). For instance,
two RNA polymerase II elongation proteins have been implicated in oncogenesis: ELL, which is a
frequent target for translocation in acute myeloid leukemia (Thirman et al. Proc Natl Acad Sci USA
1994;91:12110-4; Mitani et al. Blood 1995;85:2017-24), and elongin, which is a transcription
factor regulated by the product of the von Hippel-Lindau tumor suppressor gene, which is itself
mutated in the majority of clear-cell renal carcinomas and in families with von Hippel-Lindau
disease (Duan et al. Science 1995;269:1402-6; Kibel et al. Science 1995;269:1444-6). In addition,
overexpression of ELL leads to the transformation of fibroblasts (Kanda et al. J Biol Chem
1998;273:5248-52). Thus, the proteins of SEQ ID NOs:248 and 313 may be important for
oncogenesis of multiple types of neoplastic diseases, especially hematological malignancies.

In one embodiment, the present proteins are used to increase the rate of transcription in
vitro. Such an increase can be used for any of the large number of in vitro transcription reactions
which are routinely used for molecular biological applications, e.g. for the preparation of RNA, for
protein production, for the characterization of promoters and transcription factors, etc.

In another embodiment, the present invention provides diagnostic tools for the detection of
mutations in the genes encoding SEQ ID NOs:248 or 313. Such mutations may be detected by a
variety of techniques, including RNase and S1 protection assays; alterations in electrophoretic
mobility of DNA fragments in gels, with or without denaturing agents (e.g. SSCP or DGGE);
dHPLC; and direct DNA sequencing. The detection of mutations in the genes encoding SEQ ID
NOs:248 or 313 is useful for the detection of a number of diseases and conditions, such as cancers
and hematological malignancies including leukemia. For example, the RNA polymerase II
Elongation Factor ELL gene undergoes frequent translocations in acute myeloid leukemia (Thirman
et al. Proc Natl Acad Sci USA 1994;91:12110-4; Mitani et al. Blood 1995;85:2017-24), and it is
likely that other elongation factors are involved in additional such diseases.

Another embodiment of the present invention relates to compositions and methods for
using the proteins or part thereof to specifically visualize myeloid precursor cells, as well as
pancreas, liver and testis tissues (and cells derived from the tissues). The ability to detect such cell
types is useful for any of a number of applications, including to determine the origin or identity of,
e.g. cancerous cells, as well as to facilitate the identification of particular cells and tissues for, e.g.
the evaluation of histological slides. In addition, such methods can be used to examine the extent of
differentiation in myeloid or myeloid-progenitor cells for staging of leukemia or any other
neoplastic disorder. Any method for detecting the presence of the proteins of the invention, or
nucleic acids encoding the proteins, can be used, including methods involving the use of antibodies
immunospecific for the proteins of the invention. Such antibodies can be used in various methods
including radioimmunoassays, competitive binding assays, Western blot analysis and
enzyme-linked immunosorbent assays (ELISA), or any other technique known to those skilled in the
art. In another embodiment, the present protein or part thereof can be used for the treatment,
attenuation and/or prevention of conditions associated with unbalanced amounts and/or activity of
the protein of SEQ ID NO:248 or 313. Other modulatory substances can also be used in such
embodiments, including chemical compounds such as agonists and antagonists, nucleic acids
including antisense and ribozyme sequences, and antibodies. In a preferred embodiment, such
substances are employed for the treatment or prevention of certain types of neoplastic disorders,
such as cancer or hematological malignancies including leukemia. In such embodiments,
where an increased level of expression or activity of the present proteins is correlated with the
presence of a disease such as cancer, the disease can be treated or prevented using any agent that
can provoke a decrease in the level of activity or expression of the protein, such as antibodies,
antisense molecules, ribozymes, dominant negative forms of the protein, compounds that inhibit the
expression or activity of the proteins, and others. Alternatively, in cases where a decreased level of
expression or activity of the proteins is correlated with the presence of a disease such as cancer, the
disease can be treated using any agent that can cause an increase in the expression or activity of the
protein, such as polynucleotides encoding the proteins, purified forms of the proteins, or any
compound that causes an increase in the expression or activity of the proteins. Further, any
detection of a correlation between the level of expression or activity of the protein and the presence
or absence of a disease can be used to develop diagnostic or screening tools for the detection of the
disease itself, or of a predisposition for the disease.

Uses of antibodies

Antibodies of the present invention have uses that include, but are not limited to, methods
known in the art to purify, detect, and target the polypeptides of the present invention including
both in vitro and in vivo diagnostic and therapeutic methods. An example of such use using
immunoaffinity chromatography is given below. The antibodies of the present invention may be
used either alone or in combination with other compositions. For example, the antibodies have use
in immunoassays for qualitatively and quantitatively measuring levels of antigen-bearing substances,
including the polypeptides of the present invention, in biological samples (See, e.g., Harlow et al.,
1988, incorporated by reference in its entirety). The antibodies may also be used in therapeutic
compositions for killing cells expressing the protein or reducing the levels of the protein in the body.

The invention further relates to antibodies that act as agonists or antagonists of the
polypeptides of the present invention. For example, the present invention includes antibodies that
disrupt the receptor/ligand interactions with the polypeptides of the invention either partially or
fully. Included are both receptor-specific antibodies and ligand-specific antibodies. Included are
receptor-specific antibodies which do not prevent ligand binding but prevent receptor activation.
Receptor activation (i.e., signaling) may be determined by techniques described herein or otherwise
known in the art. Also included are receptor-specific antibodies which prevent both ligand binding
and receptor activation. Likewise, included are neutralizing antibodies that bind the ligand and
prevent binding of the ligand to the receptor, as well as antibodies that bind the ligand, thereby
preventing receptor activation, but do not prevent the ligand from binding the receptor. Further
included are antibodies that activate the receptor. These antibodies may act as agonists for either all
or less than all of the biological activities affected by ligand-mediated receptor activation. The
antibodies may be specified as agonists or antagonists for biological activities comprising specific
activities disclosed herein. The above antibody agonists can be made using methods known in the
art. See e.g., WO 96/40281; US Patent 5,811,097; Deng et al. (1998); Chen et al. (1998); Harrop et
al. (1998); Zhu et al. (1998); Yoon et al. (1998); Prat et al. (1998); Pitard et al. (1997); Liautard et
al. (1997); Carlson et al. (1997); Taryman et al. (1995); Muller et al. (1998); Bartunek et al. (1996)
(said references incorporated by reference in their entireties).

As discussed above, antibodies to the polypeptides of the invention can, in turn, be utilized
to generate anti-idiotypic antibodies that "mimic" polypeptides of the invention using techniques
well known to those skilled in the art (See, e.g. Greenspan and Bona (1989) and Nisonoff (1991),
which disclosures are hereby incorporated by reference in their entireties). For example, antibodies
which bind to and competitively inhibit polypeptide multimerization or binding of a polypeptide of
the invention to ligand can be used to generate anti-idiotypes that "mimic" the polypeptide
multimerization or binding domain and, as a consequence, bind to and neutralize the polypeptide or its
ligand. Such neutralizing anti-idiotypic antibodies can be used to bind a polypeptide of the
invention or to bind its ligands/receptors, and thereby block its biological activity.

Immunoaffinity Chromatography 

Antibodies prepared as described herein are coupled to a support. Preferably, the antibodies
are monoclonal antibodies, but polyclonal antibodies may also be used. The support may be any of
those typically employed in immunoaffinity chromatography, including Sepharose CL-4B
(Pharmacia, Piscataway, NJ), Sepharose CL-2B (Pharmacia, Piscataway, NJ), Affi-gel 10 (Biorad,
Richmond, CA), or glass beads.

The antibodies may be coupled to the support using any of the coupling reagents typically
used in immunoaffinity chromatography, including cyanogen bromide. After coupling the antibody
to the support, the support is contacted with a sample which contains a target polypeptide whose
isolation, purification or enrichment is desired. The target polypeptide may be a polypeptide
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides
included in SEQ ID Nos: 242-272 and 274-384 as well as full-length and mature polypeptides
encoded by the clone inserts of the deposited clone pool, variants and fragments thereof, or a fusion
protein comprising said selected polypeptide or a fragment thereof.

Preferably, the sample is placed in contact with the support for a sufficient amount of time
and under appropriate conditions to allow at least 50% of the target polypeptide to specifically bind
to the antibody coupled to the support.

Thereafter, the support is washed with an appropriate wash solution to remove polypeptides
which have non-specifically adhered to the support. The wash solution may be any of those
typically employed in immunoaffinity chromatography, including PBS, Tris-lithium chloride buffer
(0.1 M lysine base and 0.5 M lithium chloride, pH 8.0), Tris-hydrochloride buffer (0.05 M
Tris-hydrochloride, pH 8.0), or Tris/Triton/NaCl buffer (50 mM Tris-Cl, pH 8.0 or 9.0, 0.1% Triton
X-100, and 0.5 M NaCl).
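
As a worked example of the concentrations quoted for the wash solutions, the following minimal sketch converts a target molarity and volume into a mass of solid reagent. The molar masses are standard values for anhydrous lithium chloride and Tris base; the batch volumes, the function name, and the choice to prepare the Tris buffer from Tris base titrated with HCl are assumptions made for illustration.

```python
# Molar masses in g/mol (standard values; Tris base is C4H11NO3).
MOLAR_MASS_G_PER_MOL = {
    "LiCl": 42.39,
    "Tris base": 121.14,
}


def grams_needed(molarity_mol_per_l: float, molar_mass_g_per_mol: float,
                 volume_l: float) -> float:
    """Mass of solute required: mass = molarity x molar mass x volume."""
    if min(molarity_mol_per_l, molar_mass_g_per_mol, volume_l) < 0:
        raise ValueError("inputs must be non-negative")
    return molarity_mol_per_l * molar_mass_g_per_mol * volume_l


if __name__ == "__main__":
    # 0.5 M lithium chloride, hypothetical 1 L batch.
    print(round(grams_needed(0.5, MOLAR_MASS_G_PER_MOL["LiCl"], 1.0), 3), "g LiCl")
    # 0.05 M Tris, hypothetical 0.5 L batch, pH adjusted separately with HCl.
    print(round(grams_needed(0.05, MOLAR_MASS_G_PER_MOL["Tris base"], 0.5), 4), "g Tris base")
```

If a hydrated salt or the Tris hydrochloride salt is weighed instead, the corresponding molar mass must be substituted.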

After washing, the specifically bound target polypeptide is eluted from the support using the
high pH or low pH elution solutions typically employed in immunoaffinity chromatography. In
particular, the elution solutions may contain an eluant such as triethanolamine, diethylamine,
calcium chloride, sodium thiocyanate, potassium bromide, acetic acid, or glycine. In some
embodiments, the elution solution may also contain a detergent such as Triton X-100 or
octyl-beta-D-glucoside.

Import vectors 

The GENSET polypeptides of the invention may also be used as a carrier to import a 
protein or peptide of interest, so-called cargo, into tissue-culture cells or into host organisms. A 
hydrophobic region of a GENSET polypeptide or a fragment thereof, preferably the signal peptide 
of a sequence selected from the group consisting of SEQ ID Nos: 1-31 and 33-143 and clone 
inserts of the deposited clone pool, more preferably the short core hydrophobic region (h) of signal 
peptides, may be used as a carrier. 

When cell permeable peptides of limited size (approximately up to 25 amino acids) are to 
be translocated across the cell membrane, chemical synthesis may be used in order to add the h region 
to either the C-terminus or the N-terminus of the cargo peptide of interest. Alternatively, when 
longer peptides or proteins are to be imported into cells, nucleic acids can be genetically engineered, 
using techniques familiar to those skilled in the art, in order to link the cDNA sequence or fragment 
thereof encoding the hydrophobic region to the 5' or the 3' end of a DNA sequence coding for a 
cargo polypeptide. Such genetically engineered nucleic acids are then translated either in vitro or in 
vivo after transfection into appropriate cells, using conventional techniques to produce the resulting 
cell permeable polypeptide. Suitable host cells are then simply incubated with the cell permeable 
polypeptide which is then translocated across the membrane. 

This method may be applied to study diverse intracellular functions and cellular processes. 
For instance, it has been used to probe functionally relevant domains of intracellular proteins and to 
examine protein-protein interactions involved in signal transduction pathways (Lin et al., J. Biol. 
Chem., 270: 14225-14258 (1995); Lin et al., J. Biol. Chem., 271: 5305-5308 (1996); Rojas et al., J. 
Biol. Chem., 271: 27456-27461 (1996); Rojas et al., Nature Biotech., 16: 370-375 (1998); Liu et al., 
Proc. Natl. Acad. Sci. USA, 93: 11819-11824 (1996); Rojas et al., Bioch. Biophys. Res. Commun., 
234: 675-680 (1997); Du et al., J. Peptide Res., 51: 235-243 (1998)). 

Such techniques may be used in cellular therapy to import proteins producing therapeutic 
effects. For instance, cells isolated from a patient may be treated with imported therapeutic proteins 
and then re-introduced into the host organism. 

Alternatively, the hydrophobic region of signal peptides of the present invention could be 
used in combination with a nuclear localization signal to deliver nucleic acids into the cell nucleus. 
Such nucleic acids may be antisense oligonucleotides or oligonucleotides designed to form triple 
helices, as described herein, in order to respectively inhibit processing or maturation of a target 
cellular RNA. 

Expression of GENSET products 

Spatial expression of the GENSET genes of the invention 

Tissue expression of the cDNAs of the present invention was examined. Table IX lists the 
Genset libraries of tissues and cell types examined that express the polynucleotides of the present 
invention. The tissues and cell types examined for polynucleotide expression were: adrenal gland 
(AG), bone marrow (BM), brain (Br), cancerous prostate (CP), cerebellum (Ce), colon (Co), 
dystrophic muscle (DM), fetal brain (FB), fetal kidney (FK), fetal liver (FL), heart (He), 
hypertrophic prostate (HP), kidney (Ki), liver (Li), lung (Lu), lung cells (LC), lymph ganglia (LG), 
lymphocytes (Ly), muscle (Mu), ovary (Ov), pancreas (Pa), pituitary gland (PG), placenta (PI), 
prostate (Pr), salivary gland (SG), spinal cord (SC), spleen (Sp), stomach/intestine (SI), substantia 
nigra (SN), testis (Te), thyroid (Ty), umbilical cord (UC) and uterus (Ut). 

For each cDNA referred to by its sequence identification number (first column), the number 
of proprietary 5'ESTs (i.e. cDNA fragments) expressed in a particular tissue referred to by its name 
is indicated after a semicolon (second column). In addition, the bias in the spatial distribution of 
the polynucleotide sequences of the present invention was examined by comparing the relative 
proportions of the biological polynucleotides of a given tissue using the following statistical 
analysis. The under- or over-representation of a polynucleotide of a given cluster in a given tissue 
was assessed using the normal approximation of the binomial distribution. When the observed 
proportion of a polynucleotide of a given tissue in a given consensus had less than a 1% chance of 
occurring randomly according to the chi-square test, the frequency bias was reported as "preferred". The 
results are given in Table X as follows. For each polynucleotide showing a bias in tissue distribution 
as referred to by its sequence identification number in the first column, the list of tissues where the 
polynucleotides are under-represented is given in the second column entitled "low frequency 
expression" and the list of tissues where the polynucleotides are over-represented is given in the 
third column entitled "high frequency expression". 
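
The statistical analysis described above can be illustrated with a minimal Python sketch. It is not the code used to generate Table X; the function name, its parameters and the way the expected proportion is supplied are assumptions made for illustration. The sketch relies on the fact that, for a single tissue considered against all others, the square of the normal-approximation z score equals the one-degree-of-freedom chi-square statistic, so the two-sided normal p-value can stand in for the chi-square p-value.

```python
import math


def tissue_bias(k, n, p0, alpha=0.01):
    """Classify the representation of a cluster in one tissue.

    k     -- ESTs of the cluster observed in the tissue (hypothetical input)
    n     -- total ESTs of the cluster across all tissues
    p0    -- proportion expected by chance, e.g. the share of all ESTs
             that come from this tissue's library (assumed definition)
    alpha -- significance threshold; 0.01 mirrors the 1% criterion in the text
    """
    if n <= 0 or not 0.0 < p0 < 1.0:
        raise ValueError("n must be positive and p0 strictly between 0 and 1")
    expected = n * p0
    std_dev = math.sqrt(n * p0 * (1.0 - p0))
    z = (k - expected) / std_dev
    # Two-sided p-value under the normal approximation; z*z is the
    # chi-square statistic with one degree of freedom.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    if p_value < alpha:
        label = "high" if z > 0 else "low"
    else:
        label = "none"
    return label, z, p_value
```

For example, `tissue_bias(30, 100, 0.1)` classifies a cluster with 30 of its 100 ESTs in a tissue that contributes 10% of all ESTs as over-represented. The normal approximation is only reliable when `n * p0` and `n * (1 - p0)` are not too small.
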

The cellular localization of some polypeptides of the invention was also determined using 
the "psort software" (Nakai and Horton, (1999); Nakai and Kanehisa, (1992), which disclosures are 
hereby incorporated by reference in their entireties). For each polypeptide identified by its 
sequence identification number in the first column, the second column of Table XI lists the predicted 
subcellular localization. 

Evaluation of Expression Levels and Patterns of GENSET mRNAs 

The spatial and temporal expression patterns of GENSET mRNAs, as well as their 
expression levels, may also be further determined as follows. 

Expression levels and patterns of GENSET mRNAs may be analyzed by solution 
hybridization with long probes as described in International Patent Application No. WO 97/05277, 
the entire contents of which are hereby incorporated by reference. Briefly, a GENSET 
polynucleotide, or fragment thereof, corresponding to the gene encoding the mRNA to be 
characterized is inserted at a cloning site immediately downstream of a bacteriophage (T3, T7 or 
SP6) RNA polymerase promoter to produce antisense RNA. Preferably, the GENSET 
polynucleotide is at least 100 nucleotides in length. The plasmid is linearized and transcribed in 
the presence of ribonucleotides comprising modified ribonucleotides (i.e. biotin-UTP and 
DIG-UTP). An excess of this doubly labeled RNA is hybridized in solution with mRNA isolated from 
cells or tissues of interest. The hybridizations are performed under standard stringent conditions 
(40-50°C for 16 hours in an 80% formamide, 0.4 M NaCl buffer, pH 7-8). The unhybridized probe 
is removed by digestion with ribonucleases specific for single-stranded RNA (i.e. RNases CL3, T1, 
Phy M, U2 or A). The presence of the biotin-UTP modification enables capture of the hybrid on a 
microtitration plate coated with streptavidin. The presence of the DIG modification enables the 
hybrid to be detected and quantified by ELISA using an anti-DIG antibody coupled to alkaline 
phosphatase. 

The GENSET cDNAs, or fragments thereof may also be tagged with nucleotide sequences 
for the serial analysis of gene expression (SAGE) as disclosed in UK Patent Application No. 2 305 
241 A, the entire contents of which are incorporated by reference. In this method, cDNAs are 
prepared from a cell, tissue, organism or other source of nucleic acid for which it is desired to 
determine gene expression patterns. The resulting cDNAs are separated into two pools. The 
cDNAs in each pool are cleaved with a first restriction endonuclease, called an "anchoring enzyme," 
having a recognition site which is likely to be present at least once in most cDNAs. The fragments 
which contain the 5' or 3' most region of the cleaved cDNA are isolated by binding to a capture 
medium such as streptavidin coated beads. A first oligonucleotide linker having a first sequence for 
hybridization of an amplification primer and an internal restriction site for a "tagging endonuclease" 
is ligated to the digested cDNAs in the first pool. Digestion with the second endonuclease produces 
short "tag" fragments from the cDNAs. A second oligonucleotide having a second sequence for 
hybridization of an amplification primer and an internal restriction site is ligated to the digested 
cDNAs in the second pool. The cDNA fragments in the second pool are also digested with the 
"tagging endonuclease" to generate short "tag" fragments derived from the cDNAs in the second 
pool. The "tags" resulting from digestion of the first and second pools with the anchoring enzyme 
and the tagging endonuclease are ligated to one another to produce "ditags." In some embodiments, 
the ditags are concatamerized to produce ligation products containing from 2 to 200 ditags. The tag 
sequences are then determined and compared to the sequences of the GENSET cDNAs to determine 
which genes are expressed in the cell, tissue, organism, or other source of nucleic acids from which 
the tags were derived. In this way, the expression pattern of a GENSET gene in the cell, tissue, 
organism, or other source of nucleic acids is obtained. 
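
The final comparison step, in which sequenced tags are matched against the GENSET cDNAs, can be sketched in Python as follows. This is an illustrative sketch only: the anchoring site `CATG`, the 10-nucleotide tag length, the choice of the 3'-most site and all function names are assumptions, not values taken from the cited application.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme recognition site
TAG_LENGTH = 10       # assumed tag length in nucleotides


def reference_tag(cdna):
    """Return the tag adjacent to the 3'-most anchoring site, or None."""
    pos = cdna.rfind(ANCHOR_SITE)
    if pos == -1:
        return None
    start = pos + len(ANCHOR_SITE)
    tag = cdna[start:start + TAG_LENGTH]
    # Discard tags truncated by the end of the sequence.
    return tag if len(tag) == TAG_LENGTH else None


def count_expression(observed_tags, cdnas):
    """Count observed tags per cDNA.

    observed_tags -- iterable of tag strings read from sequenced ditags
    cdnas         -- dict mapping a cDNA identifier to its sequence
    """
    tag_to_cdna = {}
    for name, sequence in cdnas.items():
        tag = reference_tag(sequence)
        if tag is not None:
            tag_to_cdna[tag] = name
    counts = Counter()
    for tag in observed_tags:
        name = tag_to_cdna.get(tag)
        if name is not None:
            counts[name] += 1
    return counts
```

The resulting per-cDNA counts give a rough digital expression profile for the source from which the tags were derived. Tags shared by several cDNAs are not disambiguated in this sketch; the last cDNA registered for a tag wins.
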

Quantitative analysis of GENSET gene expression may also be performed using arrays. For 
example, quantitative analysis of gene expression may be performed with GENSET 
polynucleotides, or fragments thereof, in a complementary DNA microarray as described by Schena 
et al. (1995 and 1996), which disclosures are hereby incorporated by reference in their entireties. 
GENSET cDNAs or fragments thereof are amplified by PCR and arrayed from 96-well microtiter 
plates onto silylated microscope slides using high-speed robotics. Printed arrays are incubated in a 
humid chamber to allow rehydration of the array elements and rinsed, once in 0.2% SDS for 1 min, 
twice in water for 1 min and once for 5 min in sodium borohydride solution. The arrays are 
submerged in water for 2 min at 95°C, transferred into 0.2% SDS for 1 min, rinsed twice with 
water, air dried and stored in the dark at 25°C. Cell or tissue mRNA is isolated or commercially 
obtained and probes are prepared by a single round of reverse transcription. Probes are hybridized 
to 1 cm² microarrays under a 14 x 14 mm glass coverslip for 6-12 hours at 60°C. Arrays are 
washed for 5 min at 25°C in low stringency wash buffer (1X SSC/0.2% SDS), then for 10 min at 
room temperature in high stringency wash buffer (0.1X SSC/0.2% SDS). Arrays are scanned in 
0.1X SSC using a fluorescence laser scanning device fitted with a custom filter set. Accurate 
differential expression measurements are obtained by taking the average of the ratios of two 
independent hybridizations. 
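
The averaging step can be sketched in Python as follows. The data layout is an assumption made for illustration: each hybridization is represented as a dictionary mapping an array element to a pair of signals (test sample, reference sample), and the function name is hypothetical.

```python
def mean_expression_ratio(hyb_a, hyb_b):
    """Average the test/reference ratios of two independent hybridizations.

    hyb_a, hyb_b -- dicts mapping an array element identifier to a
                    (test_signal, reference_signal) pair (assumed layout)
    Elements missing from either hybridization, or with a non-positive
    reference signal, are skipped.
    """
    averaged = {}
    for element in hyb_a.keys() & hyb_b.keys():
        pairs = (hyb_a[element], hyb_b[element])
        if any(reference <= 0 for _, reference in pairs):
            continue
        ratios = [test / reference for test, reference in pairs]
        averaged[element] = sum(ratios) / len(ratios)
    return averaged
```

An element measured at ratios of 2.0 and 3.0 in the two hybridizations is thus reported at 2.5.
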

Quantitative analysis of the expression of genes may also be performed with GENSET 
cDNAs or fragments thereof in complementary DNA arrays as described by Pietu et al. (1996), 
which disclosure is hereby incorporated by reference in its entirety. The GENSET polynucleotides 
of the invention or fragments thereof are PCR amplified and spotted on membranes. Then, mRNAs 
originating from various tissues or cells are labeled with radioactive nucleotides. After 
hybridization and washing in controlled conditions, the hybridized mRNAs are detected by 
phospho-imaging or autoradiography. Duplicate experiments are performed and a quantitative 
analysis of differentially expressed mRNAs is then performed. 

Alternatively, expression analysis of GENSET genes can be done through high density 
nucleotide arrays as described by Lockhart et al. (1996) and Sosnowski et al. (1997), which 
disclosures are hereby incorporated by reference in their entireties. Oligonucleotides of 15-50 
nucleotides corresponding to sequences of a GENSET polynucleotide or fragments thereof are 
synthesized directly on the chip (Lockhart et al., supra) or synthesized and then addressed to the 
chip (Sosnowski et al., supra). Preferably, the oligonucleotides are about 20 nucleotides in length. 
cDNA probes labeled with an appropriate compound, such as biotin, digoxigenin or fluorescent dye, 
are synthesized from the appropriate mRNA population and then randomly fragmented to an 
average size of 50 to 100 nucleotides. The probes are then hybridized to the chip. After 
washing as described in Lockhart et al. (supra) and application of different electric fields 
(Sosnowski et al., supra), the dyes or labeling compounds are detected and quantified. Duplicate 
hybridizations are performed. Comparative analysis of the intensity of the signal originating from 
cDNA probes on the same target oligonucleotide in different cDNA samples indicates a differential 
expression of the GENSET mRNA. 

Uses of GENSET expression data 

Once the expression levels and patterns of a GENSET mRNA have been determined using 
any technique known to those skilled in the art, in particular those described in the section entitled 
"Evaluation of Expression Levels and Patterns of GENSET mRNAs", or using the instant 
disclosure, this information may be used to design GENSET specific markers for detection, 
identification, screening and diagnosis purposes as well as to design DNA constructs with an 
expression pattern similar to a GENSET expression pattern. 

Detection of GENSET expression and/or biological activity 

The invention further relates to methods of detection of GENSET expression and/or 
biological activity in a biological sample using the polynucleotide and polypeptide sequences 
described herein. Such methods can be used, for example, as a screen for normal or abnormal 
GENSET expression and/or biological activity and, thus, can be used diagnostically. The biological 
sample for use in the methods of the present invention includes a suitable sample from, for example, 
a mammal, particularly a human. For example, the sample can be obtained from tissues or cell lines 
having the same origin as tissues or cell lines in which the polypeptide is known to be expressed 
using the data from Table IX. 

Detection of GENSET products 

The invention further relates to methods of detection of GENSET polynucleotides or 
polypeptides in a sample using the sequences described herein and any techniques known to those 
skilled in the art. For example, a labeled polynucleotide probe having all or a functional portion of 
the nucleotide sequence of a GENSET polynucleotide can be used in a method to detect a GENSET 
polynucleotide in a sample. In one embodiment, the sample is treated to render the polynucleotides 
in the sample available for hybridization to a polynucleotide probe, which can be DNA or RNA. 
The resulting treated sample is combined with a labeled polynucleotide probe having all or a portion 
of the nucleotide sequence of the GENSET cDNA or genomic sequence, under conditions 
appropriate for hybridization of complementary sequences to occur. Detection of hybridization of 
polynucleotides from the sample with the labeled nucleic acid probe indicates the presence of GENSET 
polynucleotides in the sample. The presence of GENSET mRNA is indicative of GENSET 
expression. 

Consequently, the invention comprises methods for detecting the presence of a 
polynucleotide comprising a nucleotide sequence selected from a group consisting of the sequences 
of SEQ ID Nos: 1-241, the sequences of clone inserts of the deposited clone pool, sequences fully 
complementary thereto, fragments and variants thereof in a sample. In a first embodiment, said 
method comprises the following steps: 

a) bringing into contact said sample and a nucleic acid probe or a plurality of nucleic acid 
probes which hybridize to said selected nucleotide sequence; and 

b) detecting the hybrid complex formed between said probe or said plurality of probes and 
said polynucleotide. 

In a preferred embodiment of the above detection method, said nucleic acid probe or said 
plurality of nucleic acid probes is labeled with a detectable molecule. In another preferred 
embodiment of the above detection method, said nucleic acid probe or said plurality of nucleic acid 
probes has been immobilized on a substrate. In still another preferred embodiment, said nucleic 
acid probe or said plurality of nucleic acid probes has a sequence comprised in a sequence 
complementary to said selected sequence. 

In a second embodiment, said method comprises the following steps: 

a) contacting said sample with amplification reaction reagents comprising a pair of 
amplification primers located on either side of the region of said nucleotide sequence to be 
amplified; 

b) performing an amplification reaction to synthesize amplification products containing said 
region of said selected nucleotide sequence; and 

c) detecting said amplification products. 
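
The geometry of a primer pair located on either side of the region to be amplified can be sketched in Python as follows. This is a simplified in-silico illustration with exact string matching and no mismatch tolerance; the function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def predicted_amplicon(template, forward_primer, reverse_primer):
    """Return the region a primer pair would amplify, or None.

    template       -- sense-strand sequence, written 5' to 3'
    forward_primer -- primer identical to the sense strand
    reverse_primer -- primer complementary to the sense strand, written 5' to 3'
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals where its reverse complement occurs on the
    # sense strand, downstream of the forward primer.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]
```

With the hypothetical template `GGGATCCAAACCCTTTGGGAATTCCC`, forward primer `GGATCC` and reverse primer `GGGAATT`, the predicted product spans both primer sites and the region between them.
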

In a preferred embodiment of the above detection method, when the polynucleotide to be 
amplified is an RNA molecule, preliminary reverse transcription and synthesis of a second cDNA 
strand are necessary to provide a DNA template to be amplified. In another preferred embodiment 
of the above detection method, the amplification product is detected by hybridization with a labeled 
probe having a sequence which is complementary to the amplified region. In still another preferred 
embodiment, at least one of said amplification primers has a sequence comprised in said selected 
sequence or in the sequence complementary to said selected sequence. 

Alternatively, a method of detecting GENSET expression in a test sample can be 
accomplished using any product which binds to a GENSET polypeptide of the present invention or 
a portion of a GENSET polypeptide. Such products may be antibodies, binding fragments of 
antibodies, polypeptides able to bind specifically to GENSET polypeptides or fragments thereof, 
including GENSET agonists and antagonists. Detection of specific binding to the antibody indicates 
the presence of a GENSET polypeptide in the sample (e.g., ELISA). 

Consequently, the invention is also directed to a method for detecting specifically the 
presence of a GENSET polypeptide according to the invention in a biological sample, said method 
comprising the following steps: 

a) bringing into contact said biological sample with a product able to bind to a polypeptide 
of the invention or fragments thereof; 

b) allowing said product to bind to said polypeptide to form a complex; and 

c) detecting said complex. 

In a preferred embodiment of the above detection method, the product is an antibody. In a 
more preferred embodiment, said antibody is labeled with a detectable molecule. In another more 
preferred embodiment of the above detection method, said antibody has been immobilized on a 
substrate. 

In addition, the invention also relates to methods of determining whether a GENSET 
product (e.g. a polynucleotide or polypeptide) is present or absent in a biological sample, said 
methods comprising the steps of: 

a) obtaining said biological sample from a human or non-human animal, preferably a 
mammal; 

b) contacting said biological sample with a product able to bind to a GENSET 
polynucleotide or polypeptide of the invention; and 

c) determining the presence or absence of said GENSET product in said biological sample. 

Compounds that specifically bind a GENSET product may either be compounds binding to 
a GENSET polypeptide (e.g. binding proteins, antibodies or binding fragments thereof (e.g. F(ab')2 
fragments)) or compounds binding to a GENSET polynucleotide (e.g. a complementary probe or 
primer). 

The present invention also relates to kits that can be used in the detection of GENSET 
expression products. The kit can comprise a compound that specifically binds a GENSET 
polypeptide (e.g. binding proteins, antibodies or binding fragments thereof (e.g. F(ab')2 fragments)) 
or a GENSET mRNA (e.g. a complementary probe or primer), for example, disposed within a 
container means. The kit can further comprise ancillary reagents, including buffers and the like. 



Detection of a GENSET biological activity 

The invention further includes methods of detecting specifically a GENSET biological 
activity. Assessing the GENSET biological activity may be performed using a variety of 
techniques, including those described elsewhere herein. 

Consequently, the invention is directed to a method for detecting specifically GENSET 
biological activity in a biological sample, said method comprising the following steps: 

a) obtaining a biological sample from a human or non-human mammal; and 

b) detecting a GENSET biological activity. 

The present invention also relates to kits that can be used in the detection of GENSET 
biological activity. 

Identification of a specific context of GENSET expression 

When the expression pattern of a GENSET mRNA shows that a GENSET gene is 
specifically expressed in a given context, probes and primers specific for this gene as well as 
antibodies binding to the corresponding GENSET polypeptide may then be used as markers for a specific 
context. Examples of specific contexts are: specific expression in a given tissue/cell or tissue/cell 
type, expression at a given stage of development of a process such as embryo development or 
disease development, or specific expression in a given organelle. Such primers, probes, and 
antibodies are useful commercially to identify tissues/cells/organelles of unknown origin, for 
example, forensic samples, differentiated tumor tissue that has metastasized to foreign bodily sites, 
or to differentiate different tissue types in a tissue cross-section using any technique known to those 
skilled in the art including in situ PCR or immunochemistry for example. 

For example, the cDNAs and proteins of the sequence listing and fragments thereof, may be 
used to distinguish human tissues/cells from non-human tissues/cells and to distinguish between 
human tissues/cells/organelles that do and do not express the polynucleotides comprising the 

cDNAs. By knowing the expression pattern of a given GENSET gene, either through routine 
experimentation or by using the instant disclosure, the polynucleotides and polypeptides of the 
present invention may be used in methods of determining the identity of an unknown tissue/cell 
sample/organelle. As part of determining the identity of an unknown tissue/cell sample/organelle, 
the polynucleotides and polypeptides of the present invention may be used to determine what the 
unknown tissue/cell sample is and what the unknown sample is not. For example, if a cDNA is 
expressed in a particular tissue/cell type/organelle, and the unknown tissue/cell sample/organelle 
does not express the cDNA, it may be inferred that the unknown tissue/cells are either not human or 
not the same human tissue/cell type/organelle as that which expresses the cDNA. These methods of 
determining tissue/cell/organelle identity are based on methods which detect the presence or 
absence of the mRNA (or corresponding cDNA) in a tissue/cell sample using methods well known in 
the art (e.g., hybridization, PCR based methods, immunoassays, immunochemistry, ELISA). 
Examples of such techniques are described in more detail below. Therefore, the invention 
encompasses uses of the polynucleotides and polypeptides of the invention as tissue markers. In a 
preferred embodiment, polynucleotides preferentially expressed in given tissues as indicated in 
Table X and polypeptides encoded by such polynucleotides are used for this purpose. The 
invention also encompasses uses of polypeptides of the invention as organelle markers. In a 
preferred embodiment, polypeptides preferentially expressed in a given subcellular compartment as 
indicated in Table XI are used for this purpose. 



Consequently, the present invention encompasses methods of identification of a tissue/cell 
type/subcellular compartment, wherein said method includes the steps of: 

a) contacting a biological sample whose identity is to be assayed with a product able to bind 
a GENSET product; and 

b) determining whether a GENSET product is expressed in said biological sample. 

Products that are able to bind specifically to a GENSET product, namely a GENSET 
polypeptide or a GENSET mRNA, include GENSET binding proteins, antibodies or binding 
fragments thereof (e.g. F(ab')2 fragments), as well as GENSET complementary probes and primers. 

Step b) may be performed using any detection method known to those skilled in the art 
including those disclosed herein, especially in the section entitled "Detection of GENSET 
expression and/or biological activity". 

Identification of Tissue Types or Cell Species by Means of Labeled Tissue Specific Antibodies 

Identification of specific tissues is accomplished by the visualization of tissue specific 
antigens by means of antibody preparations which are conjugated, directly (e.g., green fluorescent 
protein) or indirectly to a detectable marker. Selected labeled antibody species bind to their specific 
antigen binding partner in tissue sections, cell suspensions, or in extracts of soluble proteins from a 
tissue sample to provide a pattern for qualitative or semi-quantitative interpretation. 

Antisera for these procedures must have a potency exceeding that of the native preparation, 
and for that reason, antibodies are concentrated to a mg/ml level by isolation of the gamma globulin 
fraction, for example, by ion-exchange chromatography or by ammonium sulfate fractionation. 
Also, to provide the most specific antisera, unwanted antibodies, for example to common proteins, 
must be removed from the gamma globulin fraction, for example by means of insoluble 
immunoabsorbents, before the antibodies are labeled with the marker. Either monoclonal or 
heterologous antisera are suitable for either procedure. 

A. Immunohistochemical Techniques 

Purified, high-titer antibodies, prepared as described above, are conjugated to a detectable 
marker, as described, for example, by Fudenberg, (1980) or Rose et al., (1980), which disclosures 
are hereby incorporated by reference in their entireties. 
A fluorescent marker, either fluorescein or rhodamine, is preferred, but antibodies can also 
be labeled with an enzyme that supports a color producing reaction with a substrate, such as 
horseradish peroxidase. Markers can be added to tissue-bound antibody in a second step, as 
described below. Alternatively, the specific anti-tissue antibodies can be labeled with ferritin or 
other electron dense particles, and localization of the ferritin coupled antigen-antibody complexes 
achieved by means of an electron microscope. In yet another approach, the antibodies are 
radiolabeled, with, for example 125I, and detected by overlaying the antibody treated preparation 
with photographic emulsion. Preparations to carry out the procedures can comprise monoclonal or 
polyclonal antibodies to a single protein or peptide identified as specific to a tissue type, for 
example, brain tissue, or antibody preparations to several antigenically distinct tissue specific 
antigens can be used in panels, independently or in mixtures, as required. Tissue sections and cell 
suspensions are prepared for immunohistochemical examination according to common histological 
techniques. Multiple cryostat sections (about 4 µm, unfixed) of the unknown tissue and known 
control are mounted and each slide covered with different dilutions of the antibody preparation. 
Sections of known and unknown tissues should also be treated with preparations to provide a 
positive control, a negative control, for example, pre-immune sera, and a control for non-specific 
staining, for example, buffer. Treated sections are incubated in a humid chamber for 30 min at 
room temperature, rinsed, then washed in buffer for 30-45 min. Excess fluid is blotted away, and 
the marker developed. If the tissue specific antibody was not labeled in the first incubation, it can 
be labeled at this time in a second antibody-antibody reaction, for example, by adding fluorescein- 
or enzyme-conjugated antibody against the immunoglobulin class of the antiserum-producing 
species, for example, fluorescein labeled antibody to mouse IgG. Such labeled sera are 
commercially available. The antigen found in the tissues by the above procedure can be quantified 
by measuring the intensity of color or fluorescence on the tissue section, and calibrating that signal 
using appropriate standards. 

B. Identification of Tissue Specific Soluble Proteins 

The visualization of tissue specific proteins and identification of unknown tissues from that 
procedure is carried out using the labeled antibody reagents and detection strategy as described for 
immunohistochemistry; however the sample is prepared according to an electrophoretic technique 
to distribute the proteins extracted from the tissue in an orderly array on the basis of molecular 
weight for detection. A tissue sample is homogenized using a Virtis apparatus; cell suspensions are 
disrupted by Dounce homogenization or osmotic lysis, using detergents in either case as required to 
disrupt cell membranes, as is the practice in the art. Insoluble cell components such as nuclei, 
microsomes, and membrane fragments are removed by ultracentrifugation, and the soluble 
protein-containing fraction concentrated if necessary and reserved for analysis. A sample of the soluble 
protein solution is resolved into individual protein species by conventional SDS polyacrylamide 
electrophoresis as described, for example, by Davis et al., Section 19-2 (1986), using a range of 
amounts of polyacrylamide in a set of gels to resolve the entire molecular weight range of proteins 
to be detected in the sample. A size marker is run in parallel for purposes of estimating molecular 
weights of the constituent proteins. Sample size for analysis is a convenient volume of from 5 to 55 
µl, and containing from about 1 to 100 µg protein. An aliquot of each of the resolved proteins is 
transferred by blotting to a nitrocellulose filter paper, a process that maintains the pattern of 
resolution. Multiple copies are prepared. The procedure, known as Western Blot Analysis, is well 
described in Davis et al., (1986) Section 19-3. One set of nitrocellulose blots is stained with 
Coomassie Blue dye to visualize the entire set of proteins for comparison with the antibody bound 
proteins. The remaining nitrocellulose filters are then incubated with a solution of one or more 
specific antisera to tissue specific proteins prepared as described herein. In this procedure, as in 
procedure A above, appropriate positive and negative sample and reagent controls are run. 
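
Estimating molecular weights from the size marker run in parallel is commonly done by fitting log10(molecular weight) against relative migration distance. The Python sketch below assumes that this log-linear relationship holds over the range of the gel; the function names, and the use of relative mobility (Rf) as the measured quantity, are illustrative assumptions rather than part of the procedure described above.

```python
import math


def fit_marker_curve(markers):
    """Least-squares fit of log10(MW) against relative mobility.

    markers -- list of (relative_mobility, molecular_weight_kda) pairs
               measured for the size marker lane (hypothetical input)
    Returns (slope, intercept) of the fitted line.
    """
    if len(markers) < 2:
        raise ValueError("at least two marker bands are required")
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("marker bands must have distinct mobilities")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_molecular_weight(relative_mobility, slope, intercept):
    """Estimate the molecular weight (kDa) of a band from its mobility."""
    return 10 ** (slope * relative_mobility + intercept)
```

A band whose mobility falls between two marker bands is thereby assigned an interpolated molecular weight; extrapolation beyond the marker range is less reliable.
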

In either procedure A or B, a detectable label can be attached to the primary tissue 
antigen-primary antibody complex according to various strategies and permutations thereof. In a 
straightforward approach, the primary specific antibody can be labeled; alternatively, the unlabeled 
complex can be bound by a labeled secondary anti-IgG antibody. In other approaches, either the 
primary or secondary antibody is conjugated to a biotin molecule, which can, in a subsequent step, 
bind an avidin conjugated marker. According to yet another strategy, enzyme labeled or radioactive 
protein A, which has the property of binding to any IgG, is bound in a final step to either the 
primary or secondary antibody. The visualization of tissue specific antigen binding at levels above 
those seen in control tissues to one or more tissue specific antibodies, prepared from the gene 
sequences identified from cDNA sequences, can identify tissues of unknown origin, for example, 
forensic samples, or differentiated tumor tissue that has metastasized to foreign bodily sites. 

Targeting of compounds to subcellular compartments 

GENSET polypeptides expressed in specific cellular compartments/organelles may also be
used to target compounds to these compartments/organelles. The invention therefore encompasses
uses of polypeptides and polynucleotides of the invention as organelle targeting tools.

In a first embodiment, GENSET polypeptides expressed in mitochondria may be used to
target heterologous compounds, either polypeptides or polynucleotides, to mitochondria by
recombinantly or chemically fusing a fragment of the protein of the invention to a heterologous
polypeptide or polynucleotide. Preferred fragments are signal peptides, amphiphilic alpha helices
and/or any other fragments of the protein of the invention, or part thereof, that may contain
targeting signals for mitochondria including but not limited to matrix targeting signals as defined in
Herrman and Neupert (2000); Bhagwat et al. (1999); Murphy (1997); Glaser et al. (1998);
Ciminale et al. (1999), which disclosures are hereby incorporated by reference in their entireties.
Such heterologous compounds may be used to modulate mitochondrial activities. For example,
they may be used to induce and/or prevent mitochondrial-induced apoptosis or necrosis. In
addition, heterologous polynucleotides may be used for mitochondrial gene therapy to replace a
defective mitochondrial gene and/or to inhibit the deleterious expression of a mitochondrial gene.

In a second embodiment, GENSET polypeptides expressed in the endoplasmic reticulum may
be used to target heterologous polypeptides to the endoplasmic reticulum by recombinantly or
chemically fusing a fragment of the proteins of the invention to a heterologous polypeptide. Preferred
fragments are any fragments of the proteins of the invention, or part thereof, that may contain targeting
signals for the endoplasmic reticulum such as those described in Pidoux and Armstrong (1992); Munro
and Pelham (1987); Pelham (1990), which disclosures are hereby incorporated by reference in their
entireties.

In a third embodiment, GENSET polypeptides expressed in the nucleus may be used to target
heterologous polypeptides or polynucleotides to the nucleus by recombinantly or chemically fusing a
fragment of the proteins or polynucleotides of the invention to a heterologous polypeptide or
polynucleotide. Preferred fragments are any fragments of the proteins or polynucleotides of the
invention, or part thereof, that may contain targeting signals for the nucleus (nuclear localization
signals) such as those described in Christophe et al. (2000), which disclosure is hereby incorporated by
reference in its entirety.

In a fourth embodiment, GENSET polypeptides expressed in the Golgi apparatus may be used to
target heterologous polypeptides to the Golgi apparatus by recombinantly or chemically fusing a
fragment of the protein of the invention to a heterologous polypeptide. Preferred fragments are
signal peptides, transmembrane domains, tyrosine-containing regions and/or any other fragments of
the proteins of the invention, or part thereof, that may contain (1) targeting signals for the Golgi
apparatus such as the ones described in Ugur and Jones (2000); Picetti and Borrelli (2000), (2) a
tyrosine-based Golgi targeting signal region (Zhan et al. (1998); Watson and Pessin (2000); Ward
and Moss (2000)), or (3) any other region as defined in Munro (1998); Luetterforst et al. (1999);
Essl et al. (1999), which disclosures are hereby incorporated by reference in their entireties.

Screening and diagnosis of abnormal GENSET expression and/or biological activity 

Moreover, antibodies and/or primers specific for GENSET expression may also be used to
identify abnormal GENSET expression and/or biological activity, and subsequently to screen and/or
diagnose disorders associated with abnormal GENSET expression. For example, a particular
disease may result from lack of expression, overexpression, or underexpression of a GENSET
mRNA. By comparing mRNA expression patterns and quantities in samples taken from healthy
individuals with those from individuals suffering from a particular disorder, genes responsible for
this disorder may be identified. Primers, probes and antibodies specific for this GENSET may then
be used to develop screening and diagnostic kits for a disorder in which the gene of interest is
specifically expressed or in which its expression is specifically dysregulated, i.e. underexpressed or
overexpressed.



Screening for specific disorders 

The present invention also relates to methods of identifying individuals having elevated or
reduced levels of GENSET, which individuals are likely to benefit from therapies to suppress or
enhance GENSET expression, respectively. One example of such methods comprises the steps of:

a) obtaining from a human or non-human mammal a biological sample; 

b) detecting the presence in said sample of a GENSET product (mRNA or protein) using
any method known to those skilled in the art including those described herein, especially in the
section entitled "Detection of GENSET products";

c) comparing the amount of said GENSET product present in said sample with that of a
control sample; and

d) determining whether said human or non-human mammal has a reduced or elevated level of
GENSET expression compared to the control sample.

Such individuals with reduced or elevated levels of GENSET products may be predisposed
to disorders associated with dysregulation of GENSET gene expression and thus would be
candidates for therapies. The identification of elevated levels of GENSET in a patient would be
indicative of an individual that would benefit from treatment with agents that suppress GENSET
expression or activity. The identification of low levels of GENSET in a patient would be indicative
of an individual that would benefit from agents that induce GENSET expression or activity.
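Steps c) and d) and the resulting choice of therapy can be sketched as a simple comparison against the control. The two-fold threshold and the function names below are illustrative assumptions; the text does not specify how large a difference counts as elevated or reduced.

```python
def classify_level(sample_amount, control_amount, fold_threshold=2.0):
    """Compare a GENSET product amount with a control (steps c and d).

    fold_threshold is a hypothetical cut-off, not a value from the text.
    """
    if control_amount <= 0 or sample_amount < 0:
        raise ValueError("amounts must be non-negative and the control positive")
    ratio = sample_amount / control_amount
    if ratio >= fold_threshold:
        return "elevated"
    if ratio <= 1.0 / fold_threshold:
        return "reduced"
    return "unchanged"

def candidate_therapy(level):
    """Map the classification onto the kind of agent described in the text."""
    return {
        "elevated": "agents that suppress GENSET expression or activity",
        "reduced": "agents that induce GENSET expression or activity",
        "unchanged": "none indicated",
    }[level]
```

For example, `candidate_therapy(classify_level(30.0, 10.0))` selects suppressing agents, because the sample holds three times the control amount.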

Biological samples suitable for use in this method include biological fluids such as blood,
lymph, saliva, sperm, maternal milk, and tissue samples (e.g. biopsies) as well as cell cultures or
cell extracts derived, for example, from tissue biopsies. The detection step of the present method
can be performed using standard protocols for protein/mRNA detection. Examples of suitable
protocols include Northern blot analysis, immunoassays (e.g. RIA, Western blots,
immunohistochemical analyses), and PCR.

Thus, the present invention further relates to methods of identifying individuals or
non-human animals at increased risk of developing, or currently having, certain
diseases/disorders associated with abnormal GENSET expression or biological activity. One
example of such methods comprises the steps of:

a) obtaining from a human or non-human mammal a biological sample; 

b) detecting the presence or absence in said sample of a GENSET product (mRNA or 
protein); 

c) comparing the amount of said GENSET product present in said sample with that of a
control sample; and

d) determining whether said human or non-human mammal is at increased risk of
developing, or currently has, a disease or disorder.

In accordance with this method, the presence in the sample of altered levels of GENSET
product indicates that the subject is predisposed to the above-indicated diseases/disorders.
Biological samples suitable for use in this method include biological fluids such as blood, lymph,
saliva, sperm, maternal milk, and tissue samples (e.g. biopsies).

The diagnostic methodologies described herein are applicable to both humans and 
non-human mammals. 

Detection of GENSET mutations 

The invention also encompasses methods to detect mutations in GENSET polynucleotides
of the invention. Such methods may advantageously be used to detect mutations occurring in
GENSET genes and preferably in their regulatory regions. When a mutation has been shown to be
associated with a disease, screening for such mutations may be used for screening and diagnosis
purposes.

In one embodiment of the oligonucleotide arrays of the invention, an oligonucleotide probe
matrix may advantageously be used to detect mutations occurring in GENSET genes and preferably
in their regulatory regions. For this particular purpose, probes are specifically designed to have a
nucleotide sequence allowing their hybridization to the genes that carry known mutations (either by
deletion, insertion or substitution of one or several nucleotides). By known mutations is meant
mutations on the GENSET genes that have been identified according, for example, to the technique
used by Huang et al. (1996) or Samson et al. (1996), which disclosures are hereby incorporated by
reference in their entireties.

Another technique that is used to detect mutations in GENSET genes is the use of a
high-density DNA array. Each oligonucleotide probe constituting a unit element of the high-density
DNA array is designed to match a specific subsequence of a GENSET genomic DNA or cDNA.
Thus, an array consisting of oligonucleotides complementary to subsequences of the target gene
sequence is used to determine the identity of the target sequence with the wild-type gene sequence,
measure its amount, and detect differences between the target sequence and the reference wild-type
gene sequence of the GENSET gene. One such design, termed a 4L tiled array, implements a set of
four probes (A, C, G, T), preferably 15-nucleotide oligomers. In each set of four probes, the perfect
complement will hybridize more strongly than mismatched probes. Consequently, a nucleic acid
target of length L is scanned for mutations with a tiled array containing 4L probes, the whole probe
set containing all the possible mutations in the known wild-type reference sequence. The
hybridization signals of the 15-mer probe set tiled array are perturbed by a single base change in the
target sequence. As a consequence, there is a characteristic loss of signal or a "footprint" for the
probes flanking a mutation position. This technique was described by Chee et al. in 1996, which
disclosure is hereby incorporated by reference in its entirety.
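The 4L tiling principle can be illustrated with a short sketch. It is a toy model under stated assumptions, not the design of Chee et al.: hybridization is reduced to an exact-match test, probe windows are clamped at the ends of the reference so that exactly 4L probes are produced, and all function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]

def build_4l_array(reference, probe_len=15):
    """Build four probes (A, C, G, T at the interrogated base) per position.

    Returns a list of (position, window_start, substituted_base, probe) tuples.
    """
    reference = reference.upper()
    length = len(reference)
    if length < probe_len or set(reference) - set("ACGT"):
        raise ValueError("reference must be ACGT-only and at least probe_len long")
    half = probe_len // 2
    probes = []
    for pos in range(length):
        # Centre the window on pos where possible; clamp it at the sequence ends.
        start = min(max(pos - half, 0), length - probe_len)
        window = reference[start:start + probe_len]
        offset = pos - start
        for base in "ACGT":
            variant = window[:offset] + base + window[offset + 1:]
            probes.append((pos, start, base, reverse_complement(variant)))
    return probes

def call_base(probes, target, pos):
    """Return the base whose probe perfectly matches the target at pos, or None.

    None models the loss of signal ("footprint") caused by a nearby mismatch.
    """
    for probe_pos, start, base, probe in probes:
        window = target[start:start + len(probe)]
        if probe_pos == pos and reverse_complement(probe) == window:
            return base
    return None
```

With a target carrying a single substitution, `call_base` returns the new base at the mutated position and `None` at neighbouring positions whose probe windows overlap the change, which mirrors the footprint described above.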



Construction of DNA constructs with a GENSET expression pattern 

In addition, characterization of the spatial and temporal expression patterns and expression
levels of GENSET mRNAs is also useful for constructing expression vectors capable of producing a
desired level of gene product in a desired spatial or temporal manner, as discussed below.

DNA Construct That Enables Directing Temporal And Spatial GENSET Gene Expression In 
Recombinant Cell Hosts And In Transgenic Animals. 

In order to study the physiological and phenotypic consequences of a lack of synthesis of a
GENSET protein, both at the cell level and at the multicellular organism level, the invention also
encompasses DNA constructs and recombinant vectors enabling a conditional expression of a
specific allele of a GENSET genomic sequence or cDNA and also of a copy of this genomic
sequence or cDNA harboring substitutions, deletions, or additions of one or more bases with respect
to a nucleotide sequence selected from the group consisting of sequences of SEQ ID Nos 1-241 and
sequences of clone inserts of the deposited clone pool, or a fragment thereof, these base
substitutions, deletions or additions being located either in an exon, an intron or a regulatory
sequence, but preferably in the 5'-regulatory sequence or in an exon of the GENSET genomic
sequence or within the GENSET cDNA.

A first preferred DNA construct is based on the tetracycline resistance operon tet from E.
coli transposon Tn10 for controlling the GENSET gene expression, such as described by Gossen et
al. (1992, 1995) and Furth et al. (1994), which disclosures are hereby incorporated by reference in
their entireties. Such a DNA construct contains seven tet operator sequences from Tn10 (tetop) that
are fused to either a minimal promoter or a 5'-regulatory sequence of the GENSET gene, said
minimal promoter or said GENSET regulatory sequence being operably linked to a polynucleotide
of interest that codes either for a sense or an antisense oligonucleotide or for a polypeptide,
including a GENSET polypeptide or a peptide fragment thereof. This DNA construct is functional
as a conditional expression system for the nucleotide sequence of interest when the same cell also
comprises a nucleotide sequence coding for either the wild type (tTA) or the mutant (rTA) repressor
fused to the activating domain of viral protein VP16 of herpes simplex virus, placed under the
control of a promoter, such as the HCMVIE1 enhancer/promoter or the MMTV-LTR. Indeed, a
preferred DNA construct of the invention comprises both the polynucleotide containing the tet
operator sequences and the polynucleotide containing a sequence coding for the tTA or the rTA
repressor. In a specific embodiment in which the conditional expression DNA construct contains the
sequence encoding the mutant tetracycline repressor rTA, the expression of the polynucleotide of
interest is silent in the absence of tetracycline and induced in its presence.
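The conditional logic of this system can be written as a small truth function. The rTA behavior is the one stated above; the opposite behavior assigned to the wild-type fusion tTA (active without tetracycline, silenced by it) follows the general description of the system by Gossen et al. and is an assumption here, since the paragraph does not spell it out. The function name is hypothetical.

```python
def expression_is_on(transactivator, tetracycline_present):
    """Return True if the polynucleotide of interest is expressed.

    transactivator: "rTA" (mutant repressor fusion) or "tTA" (wild-type fusion).
    """
    if transactivator == "rTA":
        # Stated in the text: silent without tetracycline, induced with it.
        return tetracycline_present
    if transactivator == "tTA":
        # Assumed from the cited literature: the reverse of rTA.
        return not tetracycline_present
    raise ValueError("transactivator must be 'rTA' or 'tTA'")
```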
DNA Constructs Allowing Homologous Recombination: Replacement Vectors

A second preferred DNA construct will comprise, from 5'-end to 3'-end: (a) a first
nucleotide sequence that is comprised in the GENSET genomic sequence; (b) a nucleotide sequence
comprising a positive selection marker, such as the marker for neomycin resistance (neo); and (c) a
second nucleotide sequence that is comprised in the GENSET genomic sequence, and is located on
the genome downstream of the first GENSET nucleotide sequence (a).

In a preferred embodiment, this DNA construct also comprises a negative selection marker
located upstream of the nucleotide sequence (a) or downstream of the nucleotide sequence (c).
Preferably, the negative selection marker comprises the thymidine kinase (tk) gene (Thomas et al.,
1986), the hygromycin beta gene (Te Riele et al., 1990), the hprt gene (Van der Lugt et al., 1991;
Reid et al., 1990) or the Diphtheria toxin A fragment (Dt-A) gene (Nada et al., 1993; Yagi et
al., 1990), which disclosures are hereby incorporated by reference in their entireties. Preferably, the
positive selection marker is located within a GENSET exon sequence so as to interrupt the sequence
encoding a GENSET protein. These replacement vectors are described, for example, by Thomas et
al. (1986; 1987), Mansour et al. (1988) and Koller et al. (1992).

The first and second nucleotide sequences (a) and (c) may equally be located within a
GENSET regulatory sequence, an intronic sequence, an exon sequence or a sequence containing
regulatory and/or intronic and/or exon sequences. The size of the nucleotide sequences (a) and
(c) ranges from 1 to 50 kb, preferably from 1 to 10 kb, more preferably from 2 to 6 kb and most
preferably from 2 to 4 kb.

DNA Constructs Allowing Homologous Recombination: Cre-LoxP System. 

These new DNA constructs make use of the site specific recombination system of the P1
phage. The P1 phage possesses a recombinase called Cre which interacts specifically with a 34
base pair loxP site. The loxP site is composed of two palindromic sequences of 13 bp separated by
an 8 bp conserved sequence (Hoess et al., 1986), which disclosure is hereby incorporated by
reference in its entirety. The recombination by the Cre enzyme between two loxP sites having an
identical orientation leads to the deletion of the intervening DNA fragment.
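The structure of the site and the outcome of excision can be sketched in a few lines. The sequence used is the commonly cited wild-type loxP site, which is not given in the text and is included here as an assumption; the model also assumes that one loxP site remains after excision, and the function names are hypothetical.

```python
# Commonly cited wild-type loxP site (an assumption, not quoted from the text):
# 13 bp arm + 8 bp spacer + 13 bp arm = 34 bp.
LEFT_ARM = "ATAACTTCGTATA"
SPACER = "GCATACAT"
RIGHT_ARM = "TATACGAAGTTAT"
LOXP = LEFT_ARM + SPACER + RIGHT_ARM

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]

def arms_are_inverted_repeats(site):
    """True if a 34 bp site has 13 bp arms that are reverse complements."""
    return len(site) == 34 and site[:13] == reverse_complement(site[21:])

def cre_excise(dna):
    """Delete the segment between the first two same-orientation loxP sites.

    One loxP site is left behind. DNA without two such sites is returned
    unchanged; a site in the opposite orientation is not matched, because the
    8 bp spacer is asymmetric.
    """
    first = dna.find(LOXP)
    if first == -1:
        return dna
    second = dna.find(LOXP, first + len(LOXP))
    if second == -1:
        return dna
    return dna[:first] + dna[second:]
```

The orientation of a site is carried by the asymmetric spacer, which is why the search for a second site in the same orientation is a plain string match.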

The Cre-loxP system used in combination with a homologous recombination technique was
first described by Gu et al. (1993, 1994), which disclosures are hereby incorporated by
reference in their entireties. Briefly, a nucleotide sequence of interest to be inserted in a targeted
location of the genome harbors at least two loxP sites in the same orientation and located at the
respective ends of a nucleotide sequence to be excised from the recombinant genome. The excision
event requires the presence of the recombinase (Cre) enzyme within the nucleus of the recombinant
cell host. The recombinase enzyme may be brought at the desired time by (a) incubating the
recombinant cell hosts in a culture medium containing this enzyme, by injecting the Cre enzyme
directly into the desired cell, such as described by Araki et al. (1995), which disclosure is hereby
incorporated by reference in its entirety, or by lipofection of the enzyme into the cells, such as
described by Baubonis et al. (1993), which disclosure is hereby incorporated by reference in its
entirety; (b) transfecting the cell host with a vector comprising the Cre coding sequence operably
linked to a promoter functional in the recombinant cell host, which promoter is optionally
inducible, said vector being introduced in the recombinant cell host, such as described by Gu et
al. (1993) and Sauer et al. (1988), which disclosures are hereby incorporated by reference in their
entireties; or (c) introducing in the genome of the cell host a polynucleotide comprising the Cre
coding sequence operably linked to a promoter functional in the recombinant cell host, which
promoter is optionally inducible, said polynucleotide being inserted in the genome of the cell host
either by a random insertion event or a homologous recombination event, such as described by Gu et
al. (1994).

In a specific embodiment, when the vector containing the sequence to be inserted in the GENSET
gene by homologous recombination is constructed in such a way that selectable markers are flanked
by loxP sites of the same orientation, it is possible, by treatment with the Cre enzyme, to eliminate
the selectable markers while leaving the GENSET sequences of interest that have been inserted by a
homologous recombination event. Again, two selectable markers are needed: a positive selection
marker to select for the recombination event and a negative selection marker to select for the
homologous recombination event. Vectors and methods using the Cre-loxP system are described by
Zou et al. (1994), which disclosure is hereby incorporated by reference in its entirety.

Thus, a third preferred DNA construct of the invention comprises, from 5'-end to 3'-end:
(a) a first nucleotide sequence that is comprised in the GENSET genomic sequence; (b) a nucleotide
sequence comprising a polynucleotide encoding a positive selection marker, said nucleotide
sequence comprising additionally two sequences defining a site recognized by a recombinase, such
as a loxP site, the two sites being placed in the same orientation; and (c) a second nucleotide
sequence that is comprised in the GENSET genomic sequence, and is located on the genome
downstream of the first GENSET nucleotide sequence (a).

The sequences defining a site recognized by a recombinase, such as a loxP site, are
preferably located within the nucleotide sequence (b) at suitable locations bordering the nucleotide
sequence for which the conditional excision is sought. In one specific embodiment, two loxP sites
are located at each side of the positive selection marker sequence, in order to allow its excision at a
desired time after the occurrence of the homologous recombination event.

In a preferred embodiment of a method using the third DNA construct described above, the
excision of the polynucleotide fragment bordered by the two sites recognized by a recombinase,
preferably two loxP sites, is performed at a desired time, due to the presence within the genome of
the recombinant host cell of a sequence encoding the Cre enzyme operably linked to a promoter
sequence, preferably an inducible promoter, more preferably a tissue-specific promoter sequence
and most preferably a promoter sequence which is both inducible and tissue-specific, such as
described by Gu et al. (1994).

The presence of the Cre enzyme within the genome of the recombinant cell host may result
from the breeding of two transgenic animals, the first transgenic animal bearing the GENSET-derived
sequence of interest containing the loxP sites as described above and the second transgenic
animal bearing the Cre coding sequence operably linked to a suitable promoter sequence, such as
described by Gu et al. (1994).

Spatio-temporal control of the Cre enzyme expression may also be achieved with an
adenovirus based vector that contains the Cre gene thus allowing infection of cells, or in vivo
infection of organs, for delivery of the Cre enzyme, such as described by Anton and Graham (1995)
and Kanegae et al. (1995), which disclosures are hereby incorporated by reference in their entireties.

The DNA constructs described above may be used to introduce a desired nucleotide
sequence of the invention, preferably a GENSET genomic sequence or a GENSET cDNA sequence,
and most preferably an altered copy of a GENSET genomic or cDNA sequence, within a
predetermined location of the targeted genome, leading either to the generation of an altered copy of
a targeted gene (knock-out homologous recombination) or to the replacement of a copy of the
targeted gene by another copy sufficiently homologous to allow a homologous recombination
event to occur (knock-in homologous recombination).

Modifying GENSET expression and/or biological activity 

Modifying endogenous GENSET expression and/or biological activity is expressly
contemplated by the present invention.

Screening for compounds that modulate GENSET expression and/or biological activity 

The present invention further relates to compounds able to modulate GENSET expression
and/or biological activity and methods to use these compounds. Such compounds may interact with
the regulatory sequences of GENSET genes or they may interact with GENSET polypeptides
directly or indirectly.

Compounds Interacting With GENSET Regulatory Sequences 

The present invention also concerns a method for screening substances or molecules that are
able to interact with the regulatory sequences of a GENSET gene, such as for example promoter or
enhancer sequences in untranscribed regions of the genomic DNA, as determined using any
techniques known to those skilled in the art including those described in the section entitled
"Identification of Promoters in Cloned Upstream Sequences", or such as regulatory sequences
located in untranslated regions of GENSET mRNA.

Sequences within untranscribed or untranslated regions of polynucleotides of the invention
may be identified by comparison to databases containing known regulatory sequences such as
transcription start sites, transcription factor binding sites, promoter sequences, enhancer sequences,
5'UTR and 3'UTR elements (Pesole et al., 2000;
http://igs-server.cnrs-mrs.fr/~gauthere/UTR/index.html). Alternatively, the regulatory sequences of
interest may be identified through conventional mutagenesis or deletion analyses of reporter
plasmids using, for instance, techniques described in the section entitled "Identification of
Promoters in Cloned Upstream Sequences".

Following the identification of potential GENSET regulatory sequences, proteins which 
interact with these regulatory sequences may be identified as described below. 

Gel retardation assays may be performed independently in order to screen candidate
molecules that are able to interact with the regulatory sequences of the GENSET gene, such as
described by Fried and Crothers (1981), Garner and Revzin (1981) and Dent and Latchman (1993),
the teachings of these publications being herein incorporated by reference. These techniques are
based on the principle according to which a DNA or mRNA fragment which is bound to a protein
migrates more slowly than the same unbound DNA or mRNA fragment. Briefly, the target nucleotide
sequence is labeled. Then the labeled target nucleotide sequence is brought into contact with either
a total nuclear extract from cells containing regulation factors, or with different candidate molecules
to be tested. The interaction between the target regulatory sequence of the GENSET gene and the
candidate molecule or the regulation factor is detected after gel or capillary electrophoresis through
a retardation in the migration.

Nucleic acids encoding proteins which are able to interact with the promoter sequence of
the GENSET gene, more particularly a nucleotide sequence selected from the group consisting of
the polynucleotides of the 5' and 3' regulatory region or a fragment or variant thereof, may be
identified by using a one-hybrid system, such as that described in the booklet enclosed in the
Matchmaker One-Hybrid System kit from Clontech (Catalog Ref. n° K1603-1), the technical
teachings of which are herein incorporated by reference. Briefly, the target nucleotide sequence is
cloned upstream of a selectable reporter sequence and the resulting polynucleotide construct is
integrated in the yeast genome (Saccharomyces cerevisiae). Preferably, multiple copies of the
target sequences are inserted into the reporter plasmid in tandem. The yeast cells containing the
reporter sequence in their genome are then transformed with a library comprising fusion molecules
between cDNAs encoding candidate proteins for binding onto the regulatory sequences of the
GENSET gene and sequences encoding the activator domain of a yeast transcription factor such as
GAL4. The recombinant yeast cells are plated in a culture broth for selecting cells expressing the
reporter sequence. The recombinant yeast cells thus selected contain a fusion protein that is able to
bind onto the target regulatory sequence of the GENSET gene. Then, the cDNAs encoding the
fusion proteins are sequenced and may be cloned into expression or transcription vectors in vitro.
The binding of the encoded polypeptides to the target regulatory sequences of the GENSET gene
may be confirmed by techniques familiar to the one skilled in the art, such as gel retardation assays
or DNase protection assays.



Ligands interacting with GENSET polypeptides 

For the purpose of the present invention, a ligand means a molecule, such as a protein, a
peptide, an antibody or any synthetic chemical compound capable of binding to a GENSET protein
or one of its fragments or variants or of modulating the expression of the polynucleotide coding for
GENSET or a fragment or variant thereof.

In the ligand screening method according to the present invention, a biological sample or a
defined molecule to be tested as a putative ligand of a GENSET protein is brought into contact with
the corresponding purified GENSET protein, for example the corresponding purified recombinant
GENSET protein produced by a recombinant cell host as described herein, in order to form a
complex between this protein and the putative ligand molecule to be tested.

As an illustrative example, to study the interaction of a GENSET protein, or a fragment
comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the
group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID
Nos: 242-272 and 274-384 as well as full-length and mature polypeptides encoded by the clone
inserts of the deposited clone pool, with drugs or small molecules, such as molecules generated
through combinatorial chemistry approaches, the microdialysis coupled to HPLC method described
by Wang et al. (1997) or the affinity capillary electrophoresis method described by Bush et al.
(1997), the disclosures of which are incorporated by reference, can be used.

In further methods, peptides, drugs, fatty acids, lipoproteins, or small molecules which
interact with a GENSET protein, or a fragment comprising a contiguous span of at least 6 amino
acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or
100 amino acids of a polypeptide selected from the group consisting of sequences of SEQ ID Nos:
242-482, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, as well as full-length
and mature polypeptides encoded by the clone inserts of the deposited clone pool may be identified
using assays such as the following. The molecule to be tested for binding is labeled with a
detectable label, such as a fluorescent, radioactive, or enzymatic tag and placed in contact with
immobilized GENSET protein, or a fragment thereof, under conditions which permit specific
binding to occur. After removal of non-specifically bound molecules, bound molecules are detected
using appropriate means.

Various candidate substances or molecules can be assayed for interaction with a GENSET
polypeptide. These substances or molecules include, without being limited to, natural or synthetic
organic compounds or molecules of biological origin such as polypeptides. When the candidate
substance or molecule comprises a polypeptide, this polypeptide may be the resulting expression
product of a phage clone belonging to a phage-based random peptide library, or alternatively the
polypeptide may be the resulting expression product of a cDNA library cloned in a vector suitable
for performing a two-hybrid screening assay.

A. Candidate ligands obtained from random peptide libraries 
In a particular embodiment of the screening method, the putative ligand is the expression
product of a DNA insert contained in a phage vector (Parmley and Smith, 1988). Specifically,
random peptide phage libraries are used. The random DNA inserts encode peptides of 8 to 20
amino acids in length (Oldenburg et al., 1992; Valadon et al., 1996; Lucas, 1994; Westerink, 1995;
Felici et al., 1991), which disclosures are hereby incorporated by reference in their entireties.
According to this particular embodiment, the recombinant phages expressing a protein that binds to
an immobilized GENSET protein are retained and the complex formed between the GENSET protein
and the recombinant phage may be subsequently immunoprecipitated by a polyclonal or a
monoclonal antibody directed against the GENSET protein.

Once the ligand library in recombinant phages has been constructed, the phage population is
brought into contact with the immobilized GENSET protein. Then the preparation of complexes is
washed in order to remove the non-specifically bound recombinant phages. The phages that bind
specifically to the GENSET protein are then eluted by a buffer (acid pH) or immunoprecipitated by
the monoclonal antibody produced by the hybridoma anti-GENSET, and this phage population is
subsequently amplified by an over-infection of bacteria (for example E. coli). The selection step
may be repeated several times, preferably 2-4 times, in order to select the most specific
recombinant phage clones. The last step comprises characterizing the peptide produced by the
selected recombinant phage clones either by expression in infected bacteria and isolation, by
expressing the phage insert in another host-vector system, or by sequencing the insert contained in
the selected recombinant phages.

B. Candidate ligands obtained by competition experiments.

Alternatively, peptides, drugs or small molecules which bind to a GENSET protein or
fragment thereof comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10
amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides
included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature polypeptides
encoded by the clone inserts of the deposited clone pool, may be identified in competition
experiments. In such assays, the GENSET protein, or a fragment thereof, is immobilized to a
surface, such as a plastic plate. Increasing amounts of the peptides, drugs or small molecules are
placed in contact with the immobilized GENSET protein, or a fragment thereof, in the presence of a
detectably labeled known GENSET protein ligand. For example, the GENSET ligand may be
detectably labeled with a fluorescent, radioactive, or enzymatic tag. The ability of the test molecule
to bind the GENSET protein, or a fragment thereof, is determined by measuring the amount of
detectably labeled known ligand bound in the presence of the test molecule. A decrease in the
amount of known ligand bound to the GENSET protein, or a fragment thereof, when the test
molecule is present indicates that the test molecule is able to bind to the GENSET protein, or a
fragment thereof.
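
By way of illustration only, the read-out of such a competition experiment may be reduced to a
percentage of labeled ligand remaining bound at each concentration of test molecule, from which
a half-maximal inhibitory concentration can be estimated. The following Python sketch is not part
of the disclosed method; the function names, the background subtraction and the log-linear
interpolation are assumptions chosen for simplicity.

```python
import math

def percent_bound(signal, signal_no_competitor, background=0.0):
    """Labeled-ligand signal as a percentage of the competitor-free control."""
    return 100.0 * (signal - background) / (signal_no_competitor - background)

def estimate_ic50(concentrations, signals, signal_no_competitor, background=0.0):
    """Estimate the test-molecule concentration that leaves 50% of ligand bound.

    Interpolates linearly on a log10 concentration axis between the two
    measured points that bracket 50%. Returns None if 50% is never crossed.
    """
    previous = None
    for conc, sig in sorted(zip(concentrations, signals)):
        pct = percent_bound(sig, signal_no_competitor, background)
        if previous is not None:
            prev_conc, prev_pct = previous
            if prev_pct >= 50.0 >= pct and prev_pct != pct:
                frac = (prev_pct - 50.0) / (prev_pct - pct)
                log_ic50 = math.log10(prev_conc) + frac * (
                    math.log10(conc) - math.log10(prev_conc)
                )
                return 10 ** log_ic50
        previous = (conc, pct)
    return None

# Hypothetical data: competitor concentrations (nM) and labeled-ligand signal.
ic50 = estimate_ic50([1, 10, 100, 1000], [900, 700, 300, 100], 1000)
```

A test molecule that never reduces the bound signal to 50% yields None, which in this sketch
would be read as no detectable competition over the concentration range tested.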

C. Candidate ligands obtained by affinity chromatography. 

Proteins or other molecules interacting with a GENSET protein, or a fragment thereof
comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the
group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID
Nos: 242-272 and 274-384, as well as full-length and mature polypeptides encoded by the clone
inserts of the deposited clone pool, can also be found using affinity columns which contain the
GENSET protein, or a fragment thereof. The GENSET protein, or a fragment thereof, may be
attached to the column using conventional techniques including chemical coupling to a suitable
column matrix such as agarose, Affi Gel®, or other matrices familiar to those of skill in the art. In
some embodiments of this method, the affinity column contains chimeric proteins in which the
GENSET protein, or a fragment thereof, is fused to glutathione S-transferase (GST). A mixture of
cellular proteins or pool of expressed proteins as described above is applied to the affinity column.
Proteins or other molecules interacting with the GENSET protein, or a fragment thereof, attached to
the column can then be isolated and analyzed on 2-D electrophoresis gel as described in Ramunsen
et al. (1997), the disclosure of which is incorporated by reference. Alternatively, the proteins
retained on the affinity column can be purified by electrophoresis based methods and sequenced.
The same method can be used to isolate antibodies, to screen phage display products, or to screen
phage display human antibodies.

D. Candidate ligands obtained by optical biosensor methods

Proteins interacting with a GENSET protein, or a fragment comprising a contiguous span of
at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20,
25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the group consisting of sequences
of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, as
well as full-length and mature polypeptides encoded by the clone inserts of the deposited clone
pool, can also be screened by using an Optical Biosensor as described in Edwards and
Leatherbarrow (1997) and also in Szabo et al. (1995), the disclosures of which are incorporated by
reference. This technique permits the detection of interactions between molecules in real time,
without the need for labeled molecules. This technique is based on the surface plasmon resonance
(SPR) phenomenon. Briefly, the candidate ligand molecule to be tested is attached to a surface
(such as a carboxymethyl dextran matrix). A light beam is directed towards the side of the surface
that does not contain the sample to be tested and is reflected by said surface. The SPR phenomenon
causes a decrease in the intensity of the reflected light at a specific combination of angle and
wavelength. The binding of candidate ligand molecules causes a change in the refractive index at
the surface, which change is detected as a change in the SPR signal. For screening of candidate
ligand molecules or substances that are able to interact with the GENSET protein, or a fragment
thereof, the GENSET protein, or a fragment thereof, is immobilized onto a surface. This surface
comprises one side of a cell through which flows the candidate molecule to be assayed. The
binding of the candidate molecule on the GENSET protein, or a fragment thereof, is detected as a
change of the SPR signal. The candidate molecules tested may be proteins, peptides, carbohydrates,
lipids, or small molecules generated by combinatorial chemistry. This technique may also be
performed by immobilizing eukaryotic or prokaryotic cells or lipid vesicles exhibiting an
endogenous or a recombinantly expressed GENSET protein at their surface.

The main advantage of the method is that it allows the determination of the association rate
between the GENSET protein and molecules interacting with the GENSET protein. It is thus
possible to select specifically ligand molecules interacting with the GENSET protein, or a fragment
thereof, through strong or conversely weak association constants.
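
As a non-limiting illustration of how rate constants obtained from a biosensor trace relate to
binding strength, the sketch below assumes a simple 1:1 binding model, which the present
description does not itself specify. The names k_on, k_off and r_max are hypothetical parameters
(association rate constant, dissociation rate constant and maximal response).

```python
import math

def equilibrium_dissociation_constant(k_on, k_off):
    """K_D = k_off / k_on for a 1:1 interaction (smaller means tighter binding)."""
    return k_off / k_on

def association_response(t, conc, k_on, k_off, r_max):
    """SPR response at time t during analyte injection under a 1:1 binding model.

    R(t) = R_eq * (1 - exp(-k_obs * t)), with k_obs = k_on * conc + k_off
    and R_eq = r_max * k_on * conc / k_obs.
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * k_on * conc / k_obs
    return r_eq * (1.0 - math.exp(-k_obs * t))

# Hypothetical values: k_on in 1/(M*s), k_off in 1/s, concentration in M.
k_d = equilibrium_dissociation_constant(1e5, 1e-3)
plateau = association_response(1e6, 1e-8, 1e5, 1e-3, 100.0)
```

Under this model, injecting the candidate molecule at a concentration equal to K_D gives a
plateau at half of the maximal response, which offers a quick consistency check on fitted values.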

E. Candidate ligands obtained through a two-hybrid screening assay. 

The yeast two-hybrid system is designed to study protein-protein interactions in vivo (Fields
and Song, 1989), which disclosure is hereby incorporated by reference in its entirety, and relies
upon the fusion of a bait protein to the DNA binding domain of the yeast Gal4 protein. This
technique is also described in US Patent No. 5,667,973 and US Patent No. 5,283,173, the
technical teachings of both patents being herein incorporated by reference.

The general procedure of library screening by the two-hybrid assay may be performed as
described by Harper et al. (1993), by Cho et al. (1998), or by Fromont-Racine et al.
(1997), which disclosures are hereby incorporated by reference in their entireties.

The bait protein or polypeptide comprises, consists essentially of, or consists of a GENSET
polypeptide or a fragment thereof comprising a contiguous span of at least 6 amino acids, preferably
at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids
of a polypeptide selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature
polypeptides included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature
polypeptides encoded by the clone inserts of the deposited clone pool.

More precisely, the nucleotide sequence encoding the GENSET polypeptide or a fragment
or variant thereof is fused to a polynucleotide encoding the DNA binding domain of the GAL4
protein, the fused nucleotide sequence being inserted in a suitable expression vector, for example
pAS2 or pM3.

Then, a human cDNA library is constructed in a specially designed vector, such that the
human cDNA insert is fused to a nucleotide sequence in the vector that encodes the transcriptional
activation domain of the GAL4 protein. Preferably, the vector used is the pACT vector. The
polypeptides encoded by the nucleotide inserts of the human cDNA library are termed "prey"
polypeptides.

A third vector contains a detectable marker gene, such as the beta galactosidase gene or the CAT
gene, that is placed under the control of a regulation sequence that is responsive to the binding of a
complete Gal4 protein containing both the transcriptional activation domain and the DNA binding
domain. For example, the vector pG5EC may be used.

Two different yeast strains are also used. As an illustrative but non-limiting example, the
two yeast strains may be the following:

- Y190, the genotype of which is (MATa, Leu2-3,112 ura3-12, trp1-901, his3-D200,
ade2-101, gal4D gal80D URA3 GAL-LacZ, LYS GAL-HIS3, cyhr);

- Y187, the genotype of which is (MATα gal4 gal80 his3 trp1-901 ade2-101 ura3-52
leu2-3,-112 URA3 GAL-lacZ met-), which is the opposite mating type of Y190.

Briefly, 20 ng of pAS2/GENSET and 20 µg of pACT-cDNA library are co-transformed
into yeast strain Y190. The transformants are selected for growth on minimal media lacking
histidine, leucine and tryptophan, but containing the histidine synthesis inhibitor 3-AT (50 mM).
Positive colonies are screened for beta galactosidase by filter lift assay. The double positive
colonies (His+, beta-gal+) are then grown on plates lacking histidine and leucine, but containing
tryptophan and cycloheximide (10 mg/ml) to select for loss of pAS2/GENSET plasmids but
retention of pACT-cDNA library plasmids. The resulting Y190 strains are mated with Y187 strains
expressing GENSET or non-related control proteins, such as cyclophilin B, lamin, or SNF1, as Gal4
fusions as described by Harper et al. (1993) and by Bram et al. (1993), which disclosures are hereby
incorporated by reference in their entireties, and screened for beta galactosidase by filter lift assay.
Yeast clones that are beta gal- after mating with the control Gal4 fusions are considered false
positives.

In another embodiment of the two-hybrid method according to the invention, interaction
between the GENSET or a fragment or variant thereof with cellular proteins may be assessed using
the Matchmaker Two Hybrid System 2 (Catalog No. K 1604-1, Clontech). As described in the
manual accompanying the kit, the disclosure of which is incorporated herein by reference, nucleic
acids encoding the GENSET protein or a portion thereof are inserted into an expression vector such
that they are in frame with DNA encoding the DNA binding domain of the yeast transcriptional
activator GAL4. A desired cDNA, preferably human cDNA, is inserted into a second expression
vector such that it is in frame with DNA encoding the activation domain of GAL4. The two
expression plasmids are transformed into yeast and the yeast are plated on selection medium which
selects for expression of selectable markers on each of the expression vectors as well as GAL4
dependent expression of the HIS3 gene. Transformants capable of growing on medium lacking
histidine are screened for GAL4 dependent lacZ expression. Those cells which are positive in both the
histidine selection and the lacZ assay contain an interaction between GENSET and the protein or
peptide encoded by the initially selected cDNA insert.

Compounds Modulating GENSET biological activity

Another method of screening for compounds that modulate GENSET gene expression
and/or biological activity is by measuring the effects of test compounds on a given cellular property
in a host cell, such as apoptosis, proliferation, differentiation, protein glycosylation, etc., using a
variety of techniques known to those skilled in the art, including those described herein.

In one embodiment, the present invention relates to a method of identifying an agent which
alters GENSET activity, wherein a nucleic acid construct comprising a nucleic acid which encodes
a mammalian GENSET polypeptide is introduced into a host cell. The host cells produced are
maintained under conditions appropriate for expression of the encoded mammalian GENSET
polypeptides, whereby the nucleic acid is expressed. The host cells are then contacted with a
compound to be assessed (an agent) and the given cellular property of the cells is detected in the
presence of the compound to be assessed. Detection of a change in the given cellular property in
the presence of the agent indicates that the agent alters GENSET activity.

In a particular embodiment, the invention relates to a method of identifying an agent which
is an activator of GENSET activity, wherein detection of a change of the given cellular property in
the presence of the agent indicates that the agent activates GENSET activity. In another particular
embodiment, the invention relates to a method of identifying an agent which is an inhibitor of
GENSET activity, wherein detection of a change of the given cellular property in the presence of
the agent indicates that the agent inhibits GENSET activity.

Methods of Screening for Compounds Modulating GENSET Expression and/or Activity 

The present invention also relates to methods of screening compounds for their ability to
modulate (e.g. increase or inhibit) the activity or expression of GENSET. More specifically, the
present invention relates to methods of testing compounds for their ability either to increase or to
decrease expression or activity of GENSET. The assays are performed in vitro or in vivo.

In vitro methods 

In vitro, cells expressing GENSET are incubated in the presence and absence of the test
compound. By determining the level of GENSET expression in the presence of the test compound
or the level of GENSET activity in the presence of the test compound, compounds can be identified
that suppress or enhance GENSET expression or activity. Alternatively, constructs comprising a
GENSET regulatory sequence operably linked to a reporter gene (e.g. luciferase, chloramphenicol
acetyl transferase, LacZ, green fluorescent protein, etc.) can be introduced into host cells and the
effect of the test compounds on expression of the reporter gene detected. Cells suitable for use in
the foregoing assays include, but are not limited to, cells having the same origin as tissues or cell
lines in which the polypeptide is known to be expressed using the data from Table DC.

Consequently, the present invention encompasses a method for screening molecules that
modulate the expression of a GENSET gene, said screening method comprising the steps of:

a) cultivating a prokaryotic or a eukaryotic cell that has been transfected with a nucleotide
sequence encoding a GENSET protein or a variant or a fragment thereof, placed under the control
of its own promoter;

b) bringing into contact said cultivated cell with a molecule to be tested;

c) quantifying the expression of said GENSET protein or a variant or a fragment thereof in
the presence of said molecule.

Using DNA recombination techniques well known to one skilled in the art, the GENSET
protein encoding DNA sequence is inserted into an expression vector, downstream from its
promoter sequence. As an illustrative example, the promoter sequence of the GENSET gene is
contained in the 5' untranscribed region of the GENSET genomic DNA.

The quantification of the expression of a GENSET protein may be performed either at the
mRNA level (using for example Northern blots, RT-PCR, preferably quantitative RT-PCR with
primers and probes specific for the GENSET mRNA of interest) or at the protein level (using
polyclonal or monoclonal antibodies in immunoassays such as ELISA or RIA assays, Western blots,
or immunochemistry).
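
As one possible way of expressing quantitative RT-PCR results as a relative mRNA level, the
following sketch uses the widely known comparative threshold-cycle (2^-ddCt) calculation. This
calculation is not prescribed by the present description; it assumes roughly 100% amplification
efficiency and the use of a reference transcript, both of which are assumptions of the example.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of target mRNA in treated versus control cells (2^-ddCt).

    Each argument is a threshold cycle (Ct). The reference transcript is a
    hypothetical normalizer assumed to be unaffected by the test molecule.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)

# Hypothetical Ct values: the target appears two cycles earlier after treatment.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
```

A value above 1 suggests increased expression in the presence of the test molecule and a value
below 1 suggests decreased expression, subject to the stated assumptions.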

The present invention also concerns a method for screening substances or molecules that are
able to increase, or in contrast to decrease, the level of expression of a GENSET gene. Such a
method may allow one skilled in the art to select substances exerting a regulating effect on the
expression level of a GENSET gene and which may be useful as active ingredients included in
pharmaceutical compositions for treating patients suffering from disorders associated with abnormal
levels of GENSET products.

Thus, also part of the present invention is a method for screening a candidate molecule that
modulates the expression of a GENSET gene, this method comprising the following steps:

a) providing a recombinant cell host containing a nucleic acid, wherein said nucleic acid
comprises a GENSET 5' regulatory region or a regulatory active fragment or variant thereof,
operably linked to a polynucleotide encoding a detectable protein;

b) obtaining a candidate molecule; and

c) determining the ability of said candidate molecule to modulate the expression levels of
said polynucleotide encoding the detectable protein.
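
Step c) above can be sketched, purely as an example, by comparing the signal of the detectable
protein in cells exposed to the candidate molecule with the signal in untreated cells. The
two-fold threshold and the function names below are arbitrary assumptions, not values taken
from the present description.

```python
from statistics import mean

def fold_change(treated_signals, control_signals):
    """Ratio of mean reporter signal in treated wells to untreated wells."""
    return mean(treated_signals) / mean(control_signals)

def classify(fold, threshold=2.0):
    """Label a candidate molecule using an arbitrary fold-change threshold."""
    if fold >= threshold:
        return "increases expression"
    if fold <= 1.0 / threshold:
        return "decreases expression"
    return "no clear effect"

# Hypothetical replicate readings (e.g. reporter activity, arbitrary units).
fold = fold_change([300, 330, 270], [100, 110, 90])
call = classify(fold)
```

In practice replicate variability would also need to be taken into account before a candidate
molecule is retained; the sketch only shows the basic comparison.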

In a further embodiment, said nucleic acid comprising a GENSET 5' regulatory region or a
regulatory active fragment or variant thereof includes the 5'UTR region of a GENSET cDNA
selected from the group comprising the 5'UTRs of the sequences of SEQ ID Nos 1-241,
sequences of clone inserts of the deposited clone pool, regulatory active fragments and variants
thereof. In a more preferred embodiment of the above screening method, said nucleic acid includes
a promoter sequence which is endogenous with respect to the GENSET 5'UTR sequence. In
another more preferred embodiment of the above screening method, said nucleic acid includes a
promoter sequence which is exogenous with respect to the GENSET 5'UTR sequence defined
therein.

Preferred polynucleotides encoding a detectable protein are polynucleotides encoding beta 
galactosidase, green fluorescent protein (GFP) and chloramphenicol acetyl transferase (CAT). 

The invention further relates to a method for the production of a pharmaceutical
composition comprising a method of screening a candidate molecule that modulates the expression
of a GENSET gene and furthermore mixing the identified molecule with a pharmaceutically
acceptable carrier.

The invention also pertains to kits for the screening of a candidate substance modulating the
expression of a GENSET gene. Preferably, such kits comprise a recombinant vector that allows the
expression of a GENSET 5' regulatory region or a regulatory active fragment or a variant thereof,
operably linked to a polynucleotide encoding a detectable protein or a GENSET protein or a
fragment or a variant thereof. More preferably, such kits include a recombinant vector that
comprises a nucleic acid including the 5'UTR region of a GENSET cDNA selected from the group
comprising the 5'UTRs of the sequences of SEQ ID Nos 1-241, sequences of clone inserts of the
deposited clone pool, regulatory active fragments and variants thereof, being operably linked to a
polynucleotide encoding a detectable protein.

For the design of suitable recombinant vectors useful for performing the screening methods
described above, reference is made to the section of the present specification wherein the preferred
recombinant vectors of the invention are detailed.

Another object of the present invention comprises methods and kits for the screening of
candidate substances that interact with a GENSET polypeptide, fragments or variants thereof. By
their capacity to bind covalently or non-covalently to a GENSET protein, fragments or variants
thereof, these substances or molecules may be advantageously used both in vitro and in vivo.

In vitro, said interacting molecules may be used as detection means in order to identify the
presence of a GENSET protein in a sample, preferably a biological sample.

A method for the screening of a candidate substance that interacts with a GENSET
polypeptide, fragments or variants thereof, comprises the following steps:

a) providing a polypeptide comprising, consisting essentially of, or consisting of a GENSET
protein or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to
10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a
polypeptide selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature
polypeptides included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature
polypeptides encoded by the clone inserts of the deposited clone pool;

b) obtaining a candidate substance;

c) bringing into contact said polypeptide with said candidate substance;

d) detecting the complexes formed between said polypeptide and said candidate substance.

The invention further relates to a method for the production of a pharmaceutical
composition comprising a method for the screening of a candidate substance that interacts with a
GENSET polypeptide, fragments or variants thereof and furthermore mixing the identified
substance with a pharmaceutically acceptable carrier.

The invention further concerns a kit for the screening of a candidate substance interacting
with the GENSET polypeptide, wherein said kit comprises:

a) a polypeptide comprising, consisting essentially of, or consisting of a GENSET protein or
a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino
acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides
included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature polypeptides
encoded by the clone inserts of the deposited clone pool; and

b) optionally means useful to detect the complex formed between said polypeptide or a
variant thereof and the candidate substance.

In a preferred embodiment of the kit described above, the detection means comprises a
monoclonal or polyclonal antibody binding to said GENSET protein or fragment or variant thereof.

In vivo methods 

Compounds that suppress or enhance GENSET expression can also be identified using in
vivo screens. In these assays, the test compound is administered (e.g. IV, IP, IM, orally, or
otherwise) to the animal, for example, at a variety of dose levels. The effect of the compound on
GENSET expression is determined by comparing GENSET levels, for example in tissues known to
express the gene of interest using, for example, the data obtained in Table DC, and using Northern
blots, immunoassays, PCR, etc., as described above. Suitable test animals include rodents (e.g.,
mice and rats), primates, and other mammals. Humanized mice can also be used as test animals,
that is, mice in which the endogenous mouse protein is ablated (knocked out) and the homologous
human protein added back by standard transgenic approaches. Such mice express only the human
form of a protein. Humanized mice expressing only the human GENSET can be used to study in vivo
responses to potential agents regulating GENSET protein or mRNA levels. As an example,
transgenic mice have been produced carrying the human apoE4 gene. They are then bred with a
mouse line that lacks endogenous apoE, to produce an animal model carrying human proteins
believed to be instrumental in development of Alzheimer's pathology. Such transgenic animals are
useful for dissecting the biochemical and physiological steps of disease, and for development of
therapies for disease intervention (Loring et al., 1996) (incorporated herein by reference in its
entirety).



Uses for compounds modulating GENSET expression and/or biological activity 

Using in vivo (or in vitro) systems, it may be possible to identify compounds that exert a
tissue specific effect, for example, that increase GENSET expression or activity only in tissues of
interest. Screening procedures such as those described above are also useful for identifying agents
for their potential use in pharmacological intervention strategies. Agents that enhance GENSET
expression or stimulate its activity may thus be used to treat disorders which require upregulated
levels of GENSET gene expression and/or activity. Compounds that suppress GENSET expression
or inhibit its activity can be used to treat disorders which require downregulated levels of GENSET
gene expression and/or activity.

Also encompassed by the present invention is an agent which interacts with GENSET
directly or indirectly, and inhibits or enhances GENSET expression and/or function. In one
embodiment, the agent is an inhibitor which interferes with GENSET directly (e.g., by binding
GENSET) or indirectly (e.g., by blocking the ability of GENSET to have a GENSET biological
activity). In a particular embodiment, an inhibitor of GENSET protein is an antibody specific for
GENSET protein or a functional portion of GENSET; that is, the antibody binds a GENSET
polypeptide. For example, the antibody can be specific for a polypeptide encoded by one of the
amino acid sequences of human GENSET genes (SEQ ID Nos: 242-482, mature polypeptides
included in SEQ ID Nos: 242-272 and 274-384, full-length and mature polypeptides encoded by the
clone inserts of the deposited clone pool), mammalian GENSET or portions thereof. Alternatively, the
inhibitor can be an agent other than an antibody (e.g., small organic molecule, protein or peptide)
which binds GENSET and blocks its activity. For example, the inhibitor can be an agent which
mimics GENSET structurally, but lacks its function. Alternatively, it can be an agent which binds
to or interacts with a molecule which GENSET normally binds with or interacts with, thus blocking
GENSET from doing so and preventing it from exerting the effects it would normally exert.

In another embodiment, the agent is an enhancer (activator) of GENSET which increases
the activity of GENSET (increases the effect of a given amount or level of GENSET), increases the
length of time it is effective (by preventing its degradation or otherwise prolonging the time during
which it is active), or both, either directly or indirectly.

The GENSET sequences of the present invention can also be used to generate nonhuman
gene knockout animals, such as mice, which lack a GENSET gene or transgenically overexpress
GENSET. For example, such GENSET gene knockout mice can be generated and used to obtain
further insight into the function of GENSET as well as assess the specificity of GENSET activators
and inhibitors. Also, overexpression of GENSET (e.g., human GENSET) in transgenic mice can
be used as a means of creating a test system for GENSET activators and inhibitors (e.g., against
human GENSET). In addition, the GENSET gene can be used to clone the GENSET
promoter/enhancer in order to identify regulators of GENSET transcription. GENSET gene
knockout animals include animals which completely or partially lack the GENSET gene and/or
GENSET activity or function. Thus the present invention relates to a method of inhibiting (partially
or completely) GENSET biological activity in a mammal (e.g., human) comprising administering to
the mammal an effective amount of an inhibitor of GENSET. The invention also relates to a
method of enhancing GENSET biological activity in a mammal comprising administering to the
mammal an effective amount of an enhancer of GENSET.

Inhibiting GENSET expression 

Therapeutic compositions according to the present invention may comprise advantageously
one or several GENSET oligonucleotide fragments as an antisense tool or a triple helix tool that
inhibits the expression of the corresponding GENSET gene.

Antisense Approach 

In antisense approaches, nucleic acid sequences complementary to an mRNA are hybridized
to the mRNA intracellularly, thereby blocking the expression of the protein encoded by the mRNA.
The antisense nucleic acid molecules to be used in gene therapy may be either DNA or RNA
sequences. Preferred methods using antisense polynucleotide according to the present invention are
the procedures described by Sczakiel et al. (1995), which disclosure is hereby incorporated by
reference in its entirety.

Preferably, the antisense tools are chosen among the polynucleotides (15-200 bp long) that
are complementary to GENSET mRNA, more preferably to the 5' end of the GENSET mRNA. In
another embodiment, a combination of different antisense polynucleotides complementary to
different parts of the desired targeted gene is used.

Other preferred antisense polynucleotides according to the present invention are sequences
complementary to either a sequence of GENSET mRNAs comprising the translation initiation
codon ATG or a sequence of GENSET genomic DNA containing a splicing donor or acceptor site.
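
As a simple illustration of what "complementary to a sequence comprising the translation
initiation codon" means at the sequence level, the sketch below derives the reverse complement
of a window spanning an ATG. It is an assumption of the example that the first ATG found is the
initiation codon, which is not necessarily true of a real mRNA; the window sizes and function
names are likewise hypothetical.

```python
# Complement table for DNA bases; U (RNA) is treated like T.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")

def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def antisense_around_start(mrna, upstream=5, downstream=12):
    """Antisense DNA oligonucleotide covering the first ATG of the sequence.

    The window extends `upstream` bases before the ATG and `downstream`
    bases after it, truncated at the sequence ends.
    """
    seq = mrna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG found in sequence")
    left = max(0, start - upstream)
    right = min(len(seq), start + 3 + downstream)
    return reverse_complement(seq[left:right])

# Hypothetical toy sequence, not a GENSET sequence.
oligo = antisense_around_start("GGCCATGAAACCC", upstream=2, downstream=3)
```

The returned oligonucleotide is written 5' to 3' and would hybridize to the chosen window of the
sense strand.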

Preferably, the antisense polynucleotides of the invention have a 3' polyadenylation signal
that has been replaced with a self-cleaving ribozyme sequence, such that RNA polymerase II
transcripts are produced without poly(A) at their 3' ends, these antisense polynucleotides being
incapable of export from the nucleus, such as described by Liu et al. (1994), which disclosure is
hereby incorporated by reference in its entirety. In a preferred embodiment, these GENSET
antisense polynucleotides also comprise, within the ribozyme cassette, a histone stem-loop structure
to stabilize cleaved transcripts against 3'-5' exonucleolytic degradation, such as the structure
described by Eckner et al. (1991), which disclosure is hereby incorporated by reference in its
entirety.

The antisense nucleic acids should have a length and melting temperature sufficient to
permit formation of an intracellular duplex having sufficient stability to inhibit the expression of the
GENSET mRNA in the duplex. Strategies for designing antisense nucleic acids suitable for use in
gene therapy are disclosed in Green et al. (1986) and Izant and Weintraub (1984), the disclosures
of which are incorporated herein by reference.
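
For a first, rough impression of how length and base composition affect melting temperature,
the sketch below applies two common empirical approximations for DNA duplexes: the Wallace
rule for short oligonucleotides and a GC-content formula for longer ones. These formulas are
not taken from the references cited above, ignore salt concentration, RNA targets and
intracellular conditions, and the 14-nucleotide cutoff is a conventional assumption.

```python
def estimate_tm(oligo):
    """Approximate melting temperature (degrees C) of a DNA oligonucleotide.

    Below 14 nt: Wallace rule, Tm = 2*(A+T) + 4*(G+C).
    Otherwise:   Tm = 64.9 + 41*(G+C - 16.4) / N.
    Both are rough empirical estimates only.
    """
    seq = oligo.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if n < 14:
        return float(2 * at + 4 * gc)
    return 64.9 + 41.0 * (gc - 16.4) / n

# Hypothetical 20-mer with 50% GC content.
tm_20mer = estimate_tm("ATGC" * 5)
```

Such estimates may help rank candidate antisense sequences before experimental testing, but
they are no substitute for measuring duplex stability under the conditions of use.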

In some strategies, antisense molecules are obtained by reversing the orientation of the
GENSET coding region with respect to a promoter so as to transcribe the opposite strand from that
which is normally transcribed in the cell. The antisense molecules may be transcribed using in vitro
transcription systems such as those which employ T7 or SP6 polymerase to generate the transcript.
Another approach involves transcription of GENSET antisense nucleic acids in vivo by operably
linking DNA containing the antisense sequence to a promoter in a suitable expression vector.

Alternatively, oligonucleotides which are complementary to the strand normally transcribed 
in the cell may be synthesized in vitro. Thus, the antisense nucleic acids are complementary to the 
corresponding mRNA and are capable of hybridizing to the mRNA to create a duplex. In some
embodiments, the antisense sequences may contain modified sugar phosphate backbones to increase
stability and make them less sensitive to RNase activity. Examples of modifications suitable for use
in antisense strategies include 2' O-methyl RNA oligonucleotides and peptide nucleic acid (PNA)
oligonucleotides. Further examples are described by Rossi et al. (1991), which disclosure is hereby
incorporated by reference in its entirety. 
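
As an illustration of the complementarity and melting-temperature considerations above, the following minimal sketch derives an antisense DNA oligonucleotide from an mRNA segment and estimates its melting temperature. The target sequence and the function names are hypothetical, and the Wallace rule used here is a common rule of thumb for short oligonucleotides that is not prescribed by this description.

```python
# Illustrative sketch only; the target below is a made-up sequence,
# not one of the GENSET sequences.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(mrna_segment: str) -> str:
    """Return the antisense DNA oligonucleotide (5'->3') for an mRNA segment (5'->3')."""
    # Complement every base, then reverse so the result reads 5'->3'.
    return mrna_segment.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


target = "AUGGCCAAGCUUGGAUCC"  # hypothetical mRNA segment
oligo = antisense_dna(target)
print(oligo, wallace_tm(oligo))
```

The Wallace rule is only a first approximation; duplex stability inside a cell also depends on salt conditions, backbone chemistry and target accessibility.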

Various types of antisense oligonucleotides complementary to the sequence of the GENSET
cDNA or genomic DNA may be used. In one preferred embodiment, stable and semi-stable
antisense oligonucleotides described in International Application No. PCT WO94/23026, hereby
incorporated by reference, are used. In these molecules, the 3' end or both the 3' and 5' ends are
engaged in intramolecular hydrogen bonding between complementary base pairs. These molecules
are better able to withstand exonuclease attacks and exhibit increased stability compared to
conventional antisense oligonucleotides. 

In another preferred embodiment, the antisense oligodeoxynucleotides against herpes 
simplex virus types 1 and 2 described in International Application No. WO 95/04141, hereby 
incorporated by reference, are used. 

In yet another preferred embodiment, the covalently cross-linked antisense oligonucleotides
described in International Application No. WO 96/31523, hereby incorporated by reference, are
used. These double- or single-stranded oligonucleotides comprise one or more, respectively, inter-
or intra-oligonucleotide covalent cross-linkages, wherein the linkage consists of an amide bond
between a primary amine group of one strand and a carboxyl group of the other strand or of the
same strand, respectively, the primary amine group being directly substituted in the 2' position of
the strand nucleotide monosaccharide ring, and the carboxyl group being carried by an aliphatic
spacer group substituted on a nucleotide or nucleotide analog of the other strand or the same strand,
respectively. 

The antisense oligodeoxynucleotides and oligonucleotides disclosed in International 
Application No. WO 92/18522, incorporated by reference, may also be used. These molecules are
stable to degradation and contain at least one transcription control recognition sequence which binds
to control proteins and are effective as decoys therefor. These molecules may contain "hairpin" 
structures, "dumbbell" structures, "modified dumbbell" structures, "cross-linked" decoy structures 
and "loop" structures. 

In another preferred embodiment, the cyclic double-stranded oligonucleotides described in
European Patent Application No. 0 572 287 A2, hereby incorporated by reference, are used. These
ligated oligonucleotide "dumbbells" contain the binding site for a transcription factor and inhibit 
expression of the gene under control of the transcription factor by sequestering the factor. 

Use of the closed antisense oligonucleotides disclosed in International Application No. WO 
92/19732, hereby incorporated by reference, is also contemplated. Because these molecules have
no free ends, they are more resistant to degradation by exonucleases than are conventional
oligonucleotides. These oligonucleotides may be multifunctional, interacting with several regions
which are not adjacent to the target mRNA. 

The appropriate level of antisense nucleic acids required to inhibit gene expression may be 
determined using in vitro expression analysis. The antisense molecule may be introduced into the
cells by diffusion, injection, infection or transfection using procedures known in the art. For
example, the antisense nucleic acids can be introduced into the body as a bare or naked 
oligonucleotide, oligonucleotide encapsulated in lipid, oligonucleotide sequence encapsidated by 
viral protein, or as an oligonucleotide operably linked to a promoter contained in an expression 
vector. The expression vector may be any of a variety of expression vectors known in the art,
including retroviral or viral vectors, vectors capable of extrachromosomal replication, or integrating
vectors. The vectors may be DNA or RNA. 

The antisense molecules are introduced onto cell samples at a number of different
concentrations, preferably between 1x10^-10 M and 1x10^-4 M. Once the minimum concentration that can
adequately control gene expression is identified, the optimized dose is translated into a dosage
suitable for use in vivo. For example, an inhibiting concentration in culture of 1x10^-7 M translates into
a dose of approximately 0.6 mg/kg bodyweight. Levels of oligonucleotide approaching 100 mg/kg 
bodyweight or higher may be possible after testing the toxicity of the oligonucleotide in laboratory 
animals. It is additionally contemplated that cells from the vertebrate are removed, treated with the 
antisense oligonucleotide, and reintroduced into the vertebrate. 
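
The conversion from an in-culture concentration to a bodyweight dose can be sketched as follows. The description does not state how the 0.6 mg/kg figure is obtained; the molecular weight of about 6000 g/mol and the distribution volume of 1 L per kg bodyweight used below are assumptions chosen only because they reproduce that figure, and the function name is hypothetical.

```python
def culture_conc_to_dose_mg_per_kg(conc_molar, mol_weight=6000.0, volume_l_per_kg=1.0):
    """Convert a molar concentration to a dose in mg per kg bodyweight.

    mol_weight (g/mol) and volume_l_per_kg (L/kg) are illustrative
    assumptions, not values taken from the description.
    """
    grams_per_litre = conc_molar * mol_weight
    # Convert g/L to mg/L, then scale by the assumed volume per kg.
    return grams_per_litre * 1000.0 * volume_l_per_kg


print(round(culture_conc_to_dose_mg_per_kg(1e-7), 3))
```

Under these assumptions a longer or chemically modified oligonucleotide, having a higher molecular weight, would give a proportionally higher mass dose for the same molar concentration.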

In a preferred application of this invention, the polypeptide encoded by the gene is first
identified, so that the effectiveness of antisense inhibition on translation can be monitored using
techniques that include but are not limited to antibody-mediated tests such as RIAs and ELISA,
functional assays, or radiolabeling. 

An alternative to the antisense technology that is used according to the present invention
comprises using ribozymes that will bind to a target sequence via their complementary
polynucleotide tail and that will cleave the corresponding RNA by hydrolyzing its target site
(namely "hammerhead ribozymes"). Briefly, the simplified cycle of a hammerhead ribozyme
comprises (1) sequence specific binding to the target RNA via complementary antisense sequences;
(2) site-specific hydrolysis of the cleavable motif of the target strand; and (3) release of cleavage
products, which gives rise to another catalytic cycle. Indeed, the use of long-chain antisense
polynucleotides (at least 30 bases long) or ribozymes with long antisense arms is advantageous. A
preferred delivery system for antisense ribozymes is achieved by covalently linking these antisense
ribozymes to lipophilic groups or by using liposomes as a convenient vector. Preferred antisense
ribozymes according to the present invention are prepared as described by Rossi et al. (1991) and
Sczakiel et al. (1995), the specific preparation procedures referred to in said articles being
herein incorporated by reference.

Triple Helix Approach 

The GENSET genomic DNA may also be used to inhibit the expression of the GENSET
gene based on intracellular triple helix formation. 

Triple helix oligonucleotides are used to inhibit transcription from a genome. They are
particularly useful for studying alterations in cell activity when it is associated with a particular
gene. The GENSET cDNAs or genomic DNAs of the present invention or, more preferably, a
fragment of those sequences, can be used to inhibit gene expression in individuals having diseases
associated with expression of a particular gene. Similarly, a portion of the GENSET genomic DNA
can be used to study the effect of inhibiting GENSET transcription within a cell. Traditionally,
homopurine sequences were considered the most useful for triple helix strategies. However,
homopyrimidine sequences can also inhibit gene expression. Such homopyrimidine
oligonucleotides bind to the major groove at homopurine:homopyrimidine sequences. Thus, both
types of sequences from the GENSET genomic DNA are contemplated within the scope of this 
invention. 

To carry out gene therapy strategies using the triple helix approach, the sequences of the
GENSET genomic DNA are first scanned to identify 10-mer to 20-mer homopyrimidine or
homopurine stretches which could be used in triple-helix based strategies for inhibiting GENSET
expression. Following identification of candidate homopyrimidine or homopurine stretches, their
efficiency in inhibiting GENSET expression is assessed by introducing varying amounts of
oligonucleotides containing the candidate sequences into tissue culture cells which express the
GENSET gene. 
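
The scanning step described above can be sketched with a regular expression. This is a minimal illustration under stated assumptions: the function name and the example sequence are hypothetical, and trimming runs longer than 20 bases to their first 20 bases is a simplification rather than a rule from this description.

```python
import re


def find_triplex_candidates(dna, min_len=10, max_len=20):
    """Return (start, kind, stretch) for homopurine or homopyrimidine runs.

    Runs shorter than min_len are ignored; runs longer than max_len are
    trimmed to their first max_len bases (a simplifying assumption).
    """
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    hits = []
    for match in pattern.finditer(dna.upper()):
        run = match.group()
        kind = "homopurine" if run[0] in "AG" else "homopyrimidine"
        hits.append((match.start(), kind, run[:max_len]))
    return hits


# Made-up sequence containing one 12-base purine run and one 12-base pyrimidine run.
example = "ACGT" + "AGGAGGAAGAGG" + "CACA" + "TCTTCCTCTTCT" + "GACG"
for start, kind, stretch in find_triplex_candidates(example):
    print(start, kind, stretch)
```

Start positions are zero-based offsets into the scanned strand; scanning the complementary strand as well would report the same sites with the purine and pyrimidine labels exchanged.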

The oligonucleotides can be introduced into the cells using a variety of methods known to
those skilled in the art, including but not limited to calcium phosphate precipitation, DEAE- 
Dextran, electroporation, liposome-mediated transfection or native uptake. 

Treated cells are monitored for altered cell function or reduced GENSET expression using 
techniques such as Northern blotting, RNase protection assays, or PCR based strategies to monitor
the transcription levels of the GENSET gene in cells which have been treated with the 
oligonucleotide. The cell functions to be monitored are predicted based upon the homologies of the 
target gene corresponding to the cDNA from which the oligonucleotide was derived with known 
gene sequences that have been associated with a particular function. The cell functions can also be 
predicted based on the presence of abnormal physiology within cells derived from individuals with
a particular inherited disease, particularly when the cDNA is associated with the disease using 
techniques described in the section entitled "Identification of genes associated with hereditary 
diseases or drug response". 

The oligonucleotides which are effective in inhibiting gene expression in tissue culture cells 
may then be introduced in vivo using the techniques and at a dosage calculated based on the in vitro
results, as described in the section entitled "Antisense Approach". 

In some embodiments, the natural (beta) anomers of the oligonucleotide units can be 
replaced with alpha anomers to render the oligonucleotide more resistant to nucleases. Further, an 
intercalating agent such as ethidium bromide, or the like, can be attached to the 3' end of the alpha 
oligonucleotide to stabilize the triple helix. For information on the generation of oligonucleotides
suitable for triple helix formation see Griffin et al. (1989), which is hereby incorporated by this
reference. 

Treating GENSET-related disorders

The present invention further relates to methods of treating diseases/disorders by increasing
GENSET activity and/or expression. The invention also relates to methods of treating
diseases/disorders by decreasing GENSET activity and/or expression. These methodologies can be
effected using compounds selected using screening protocols such as those described herein and/or
by using the gene therapy and antisense approaches described in the art and herein. Gene therapy
can be used to effect targeted expression of GENSET. The GENSET coding sequence can be
cloned into an appropriate expression vector and targeted to a particular cell type(s) to achieve
efficient, high level expression. Introduction of the GENSET coding sequence into target cells can
be achieved, for example, using particle mediated DNA delivery (Haynes, 1996 and Maurer, 1999),
direct injection of naked DNA (Levy et al., 1996; and Felgner, 1996), or viral vector mediated
transport (Smith et al., 1996; Stone et al., 2000; Wu and Atai, 2000), each of which disclosures is
hereby incorporated by reference in its entirety. Tissue specific effects can be achieved, for
example, in the case of virus mediated transport by using viral vectors that are tissue specific, or by
the use of promoters that are tissue specific. 

Combinatorial approaches can also be used to ensure that the GENSET coding sequence is 
activated in the target tissue (Butt and Karathanasis, 1995; Miller and Whelan, 1997), which
disclosures are hereby incorporated by reference in their entireties. Antisense oligonucleotides
complementary to GENSET mRNA can be used to selectively diminish or ablate the expression of 
the protein, for example, at sites of inflammation. More specifically, antisense constructs or 
antisense oligonucleotides can be used to inhibit the production of GENSET in high expressing 
cells such as those cited in the third column of Table X. Antisense mRNA can be produced by
transfecting into target cells an expression vector with the GENSET gene sequence, or portion
thereof, oriented in an antisense direction relative to the direction of transcription. Appropriate 
vectors include viral vectors, including retroviral, adenoviral, and adeno-associated viral vectors, as 
well as nonviral vectors. Tissue specific promoters can be used. Alternatively, antisense 
oligonucleotides can be introduced directly into target cells to achieve the same goal. (See also
other delivery methodologies described herein in connection with gene therapy.) Oligonucleotides
can be selected/designed to achieve a high level of specificity (Wagner et al., 1996), which
disclosure is hereby incorporated by reference in its entirety. The therapeutic methodologies 
described herein are applicable to both human and non-human mammals (including cats and dogs). 

Pharmaceutical and physiologically acceptable compositions 

The present invention also relates to pharmaceutical or physiologically acceptable
compositions comprising, as active agent, the polypeptides, nucleic acids or antibodies of the
invention. The invention also relates to compositions comprising, as active agent, compounds
selected using the above-described screening protocols. Such compositions include the active agent
in combination with a pharmaceutically or physiologically acceptable carrier. In the case
of naked DNA, the "carrier" may be gold particles. The amount of active agent in the composition
can vary with the agent, the patient and the effect sought. Likewise, the dosing regimen can vary 
depending on the composition and the disease/disorder to be treated. 

Therefore, the invention relates to methods for the production of a pharmaceutical
composition comprising a method for selecting an active agent, compound, substance or molecule
using any of the screening methods described herein and furthermore mixing the identified active
agent, compound, substance or molecule with a pharmaceutically acceptable carrier.

The pharmaceutical compositions utilized in this invention may be administered by any 
number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, 
intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal,
enteral, topical, sublingual, or rectal means. In addition to the active ingredients, these
pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers comprising
excipients and auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. Further details on techniques for formulation and 
administration may be found in the latest edition of Remington's Pharmaceutical Sciences (Mack
Publishing Co., Easton, Pa.).

Pharmaceutical compositions for oral administration can be formulated using
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral
administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, 
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the 
patient. 

Pharmaceutical preparations for oral use can be obtained through a combination of active
compounds with solid excipient, optionally grinding a resulting mixture, and processing the mixture
of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable
excipients are carbohydrate or protein fillers, such as sugars, including lactose, sucrose, mannitol, or
sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as methyl cellulose,
hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums including arabic and
tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing 
agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt 
thereof, such as sodium alginate. 

Dragee cores may be used in conjunction with suitable coatings, such as concentrated sugar
solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel,
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for product 
identification or to characterize the quantity of active compound, i.e., dosage. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol.
Push-fit capsules can contain active ingredients mixed with a filler or binders, such as lactose or
starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In soft
capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils,
liquid, or liquid polyethylene glycol with or without stabilizers.

Pharmaceutical formulations suitable for parenteral administration may be formulated in
aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions may contain 
substances which increase the viscosity of the suspension, such as sodium carboxymethylcellulose, 
sorbitol, or dextran. Additionally, suspensions of the active compounds may be prepared as
appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes.
Optionally, the suspension may also contain suitable stabilizers or agents which increase the
solubility of the compounds to allow for the preparation of highly concentrated solutions. 

For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art.

The pharmaceutical compositions of the present invention may be manufactured in a
manner that is known in the art, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes. 

The pharmaceutical composition may be provided as a salt and can be formed with many 
acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, etc.
Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free
base forms. In other cases, the preferred preparation may be a lyophilized powder which may 
contain any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a 
pH range of 4.5 to 5.5, that is combined with buffer prior to use. 

After pharmaceutical compositions have been prepared, they can be placed in an
appropriate container and labeled for treatment of an indicated condition. For administration of
GENSET, such labeling would include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include compositions wherein 
the active ingredients are contained in an effective amount to achieve the intended purpose. The 
determination of an effective dose is well within the capability of those skilled in the art. 

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models, usually mice, rabbits, dogs, or pigs.
The animal model may also be used to determine the appropriate concentration range and route of 
administration. Such information can then be used to determine useful doses and routes for 
administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example
GENSET or fragments thereof, antibodies of GENSET, agonists, antagonists or inhibitors of
GENSET, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be 
determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., 
ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to
50% of the population). The dose ratio between therapeutic and toxic effects is the therapeutic
index, and it can be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which 
exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and 
animal studies is used in formulating a range of dosage for human use. The dosage contained in 
such compositions is preferably within a range of circulating concentrations that include the ED50
with little or no toxicity. The dosage varies within this range depending upon the dosage form
employed, sensitivity of the patient, and the route of administration. 
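
The therapeutic index defined above is a simple ratio, as the following minimal sketch shows. The function name and the dose values are hypothetical and are given only to illustrate the arithmetic.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only for illustration.
print(therapeutic_index(ld50=500.0, ed50=5.0))
```

A larger ratio means the lethal dose lies further above the effective dose, which is why compositions with large therapeutic indices are preferred.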

The exact dosage will be determined by the practitioner, in light of factors related to the
subject that requires treatment. Dosage and administration are adjusted to provide sufficient levels 
of the active moiety or to maintain the desired effect. Factors which may be taken into account 
include the severity of the disease state, general health of the subject, age, weight, and gender of the
subject, diet, time and frequency of administration, drug combination(s), reaction sensitivities, and
tolerance/response to therapy. Long-acting pharmaceutical compositions may be administered every
3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the 
particular formulation. 

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of
about 1 g, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art. 
Those skilled in the art will employ different formulations for nucleotides than for proteins or their 
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, 
conditions, locations, etc. 

Uses of GENSET sequences: Computer-Related Embodiments

As used herein the term "cDNA codes of SEQ ID Nos: 1-241" encompasses the nucleotide
sequences of SEQ ID Nos: 1-241 and of clone inserts of the deposited clone pool, fragments
thereof, nucleotide sequences homologous thereto, and sequences complementary to all of the
preceding sequences. The fragments include fragments of SEQ ID Nos: 1-241 comprising at least
8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, 1000 or 2000
consecutive nucleotides of SEQ ID Nos: 1-241. Preferably the fragments include signal sequences
and coding sequences for mature polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides 
described in Tables Va and Table Vb, polynucleotides encoding polypeptides described in Table VI, 
polynucleotide described herein as encoding polypeptides having a biological activity, or fragments
comprising at least 8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500,
1000 or 2000 consecutive nucleotides of the signal sequences or coding sequences for mature 
polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and Table 
Vb, polynucleotides encoding polypeptides described in Table VI, and polynucleotide described 
herein as encoding polypeptides having a biological activity. Homologous sequences and fragments
of SEQ ID Nos: 1-241 refer to a sequence having at least 99%, 98%, 97%, 96%, 95%, 90%, 85%,
80%, or 75% identity to these sequences. Identity may be determined using any of the computer
programs and parameters described herein, including BLAST2N with the default parameters or with
any modified parameters. Homologous sequences also include RNA sequences in which uridines
replace the thymines in the cDNA codes of SEQ ID Nos: 1-241. The homologous sequences may
be obtained using any of the procedures described herein or may result from the correction of a
sequencing error as described above. Preferably the homologous sequences and fragments of SEQ
ID Nos: 1-241 include polynucleotides homologous to signal sequences and coding sequences for
mature polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and
Table Vb, polynucleotides encoding a polypeptide fragment described as a domain in Table VI, 
polynucleotide described herein as encoding polypeptides having a biological activity, or fragments
comprising at least 8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500,
1000 or 2000 consecutive nucleotides of the signal sequences and coding sequences for mature
polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and Table
Vb, polynucleotides described in Table VI, and polynucleotide described herein as encoding
polypeptides having a biological activity. It will be appreciated that the cDNA codes of SEQ ID
Nos: 1-241 can be represented in the traditional single character format (See the inside back cover
of Stryer, 1995) or in any other format which records the identity of the nucleotides in a sequence.
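
The notions of fragment, percent identity and RNA equivalent used in this definition can be illustrated with a short sketch. It is a simplification under stated assumptions: identity is computed position by position on two equal-length, already aligned sequences, whereas programs such as BLAST2N perform the alignment themselves. The function names and the example sequences are hypothetical.

```python
def to_rna(cdna):
    """Return the RNA sequence in which uridines replace the thymines."""
    return cdna.upper().replace("T", "U")


def fragments(seq, n):
    """Return every fragment of n consecutive nucleotides of seq."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def percent_identity(a, b):
    """Percent identity of two equal-length, pre-aligned sequences (no gaps)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


reference = "ATGGCCAAGCTTGGATCCAA"  # made-up 20-nucleotide sequence
variant = "ATGGCCAAGCTTGGATCCAT"    # differs at the last position only
print(to_rna(reference))
print(fragments(reference, 8)[:2])
print(percent_identity(reference, variant))
```

Here a single mismatch over 20 positions gives 95% identity, which would meet the 95% threshold recited above but not the 96% threshold.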

As used herein the term "polypeptide codes of SEQ ID Nos: 242-482" encompasses the
polypeptide sequences of SEQ ID Nos: 242-482, the signal peptides included in SEQ ID Nos: 242-272
and 274-384, the mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, the
full-length, signal peptides and mature polypeptide sequences encoded by the clone inserts of the
deposited clone pool, polypeptide sequences homologous thereto, or fragments of any of the 
preceding sequences. Homologous polypeptide sequences refer to a polypeptide sequence having at 
least 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, or 75% identity to one of the polypeptide
sequences of SEQ ID Nos: 242-482, the signal peptides included in SEQ ID Nos: 242-272 and
274-384, the mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, the full-length, signal
peptides and mature polypeptide sequences encoded by the clone inserts of the deposited clone 
pool. Identity may be determined using any of the computer programs and parameters described 
herein, including FASTA with the default parameters or with any modified parameters. The 
homologous sequences may be obtained using any of the procedures described herein or may result
from the correction of a sequencing error as described above. The polypeptide fragments comprise
at least 5, 6, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200, 250, 300, 350, 400, 450 or 
500 consecutive amino acids of the polypeptides of SEQ ID Nos: 242-482. Preferably, the 
fragments include polypeptides encoded by the signal peptides included in SEQ ID Nos: 242-272
and 274-384, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, polynucleotides
described in Tables Va and in Table Vb, domains described in Table VI, epitopes described in
Table VII, polypeptides described herein as having a biological activity, or fragments comprising at
least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300 or 400 consecutive amino acids of the
signal peptides included in SEQ ID Nos: 242-272 and 274-384, mature polypeptides included in
SEQ ID Nos: 242-272 and 274-384, the polypeptides encoded by the polynucleotides described in
Tables Va and in Table Vb, domains of Table VI, epitopes of Table VII or of polypeptides
described herein as having a biological activity. It will be appreciated that the polypeptide codes of
SEQ ID Nos: 242-482 can be represented in the traditional single character format or three letter
format (See the inside back cover of Stryer, 1995) or in any other format which relates the identity
of the polypeptides in a sequence.

It will be appreciated by those skilled in the art that the nucleic acid codes of the invention 
and polypeptide codes of the invention can be stored, recorded, and manipulated on any medium
which can be read and accessed by a computer. As used herein, the words "recorded" and "stored"
refer to a process for storing information on a computer medium. A skilled artisan can readily 
adopt any of the presently known methods for recording information on a computer readable 
medium to generate manufactures comprising one or more of the nucleic acid codes of the 
invention, or one or more of the polypeptide codes of the invention. Another aspect of the present
invention is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30,
50, 75, 100, 150 or 200 nucleic acid codes of the invention. Another aspect of the present invention
is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100,
150 or 200 polypeptide codes of the invention.
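
One way of recording sequence codes on a computer readable medium is as plain text in the widely used FASTA layout, sketched below. The description does not prescribe any particular format; the FASTA choice, the function names and the record names are assumptions for illustration, and an in-memory buffer stands in for a file on disk.

```python
import io


def write_fasta(records, handle, width=60):
    """Write {name: sequence} records to a text handle in FASTA layout."""
    for name, seq in records.items():
        handle.write(">" + name + "\n")
        # Wrap each sequence into lines of at most `width` characters.
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


def read_fasta(handle):
    """Read FASTA text from a handle back into a {name: sequence} dict."""
    records, name = {}, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# Hypothetical records; the names and sequences are placeholders.
codes = {"example_nucleic_acid_code": "ATGGCCAAGCTT" * 10,
         "example_polypeptide_code": "MAKLGS"}
buffer = io.StringIO()
write_fasta(codes, buffer)
buffer.seek(0)
print(read_fasta(buffer) == codes)
```

Replacing the `io.StringIO` buffer with a handle returned by `open(path, "w")` or `open(path)` would record the same text on a hard disk or other storage device.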

Computer readable media include magnetically readable media, optically readable media,
electronically readable media and magnetic/optical media. For example, the computer readable
media may be a hard disk, a floppy disk, a magnetic tape, CD-ROM, Digital Versatile Disk (DVD),
Random Access Memory (RAM), or Read Only Memory (ROM) as well as other types of
media known to those skilled in the art. 

Embodiments of the present invention include systems, particularly computer systems
which store and manipulate the sequence information described herein. One example of a computer
system 100 is illustrated in block diagram form in Figure 2. As used herein, "a computer system" 
refers to the hardware components, software components, and data storage components used to 
analyze the nucleotide sequences of the nucleic acid codes of the invention or the amino acid 
sequences of the polypeptide codes of the invention. In one embodiment, the computer system 100
is a Sun Enterprise 1000 server (Sun Microsystems, Palo Alto, CA). The computer system 100
preferably includes a processor for processing, accessing and manipulating the sequence data. The 
processor 105 can be any well-known type of central processing unit, such as the Pentium III from 
Intel Corporation, or similar processor from Sun, Motorola, Compaq or International Business 
Machines. 

Preferably, the computer system 100 is a general purpose system that comprises the
processor 105 and one or more internal data storage components 110 for storing data, and one or
more data retrieving devices for retrieving the data stored on the data storage components. A
skilled artisan can readily appreciate that any one of the currently available computer systems is
suitable. 

In one particular embodiment, the computer system 100 includes a processor 105 connected
to a bus which is connected to a main memory 115 (preferably implemented as RAM) and one or
more internal data storage devices 110, such as a hard drive and/or other computer readable media
having data recorded thereon. In some embodiments, the computer system 100 further includes one
or more data retrieving devices 118 for reading the data stored on the internal data storage devices
110. 

The data retrieving device 118 may represent, for example, a floppy disk drive, a compact
disk drive, a magnetic tape drive, etc. In some embodiments, the internal data storage device 110 is
a removable computer readable medium such as a floppy disk, a compact disk, a magnetic tape, etc.
containing control logic and/or data recorded thereon. The computer system 100 may
advantageously include or be programmed by appropriate software for reading the control logic
and/or the data from the data storage component once inserted in the data retrieving device.

The computer system 100 includes a display 120 which is used to display output to a
computer user. It should also be noted that the computer system 100 can be linked to other
computer systems 125a-c in a network or wide area network to provide centralized access to the
computer system 100.

Software for accessing and processing the nucleotide sequences of the nucleic acid codes of
the invention or the amino acid sequences of the polypeptide codes of the invention (such as search
tools, compare tools, and modeling tools etc.) may reside in main memory 115 during execution.

In some embodiments, the computer system 100 may further comprise a sequence comparer
for comparing the above-described nucleic acid codes of the invention or the polypeptide codes of
the invention stored on a computer readable medium to reference nucleotide or polypeptide
sequences stored on a computer readable medium. A "sequence comparer" refers to one or more
programs which are implemented on the computer system 100 to compare a nucleotide or
polypeptide sequence with other nucleotide or polypeptide sequences and/or compounds including
but not limited to peptides, peptidomimetics, and chemicals stored within the data storage means.
For example, the sequence comparer may compare the nucleotide sequences of nucleic acid codes
of the invention or the amino acid sequences of the polypeptide codes of the invention stored on a
computer readable medium to reference sequences stored on a computer readable medium to
identify homologies, motifs implicated in biological function, or structural motifs. The various
sequence comparer programs identified elsewhere in this patent specification are particularly
contemplated for use in this aspect of the invention.

Figure 3 is a flow diagram illustrating one embodiment of a process 200 for comparing a
new nucleotide or protein sequence with a database of sequences in order to determine the
homology levels between the new sequence and the sequences in the database. The database of
sequences can be a private database stored within the computer system 100, or a public database
such as GENBANK, PIR or SWISSPROT that is available through the Internet.

The process 200 begins at a start state 201 and then moves to a state 202 wherein the new
sequence to be compared is stored to a memory in a computer system 100. As discussed above, the
memory could be any type of memory, including RAM or an internal storage device.

The process 200 then moves to a state 204 wherein a database of sequences is opened for
analysis and comparison. The process 200 then moves to a state 206 wherein the first sequence
stored in the database is read into a memory on the computer. A comparison is then performed at a
state 210 to determine if the first sequence is the same as the new sequence. It is important to
note that this step is not limited to performing an exact comparison between the new sequence and
the first sequence in the database. Methods are well known to those of skill in the art for
comparing two nucleotide or protein sequences, even if they are not identical. For example, gaps
can be introduced into one sequence in order to raise the homology level between the two tested
sequences. The parameters that control whether gaps or other features are introduced into a
sequence during comparison are normally entered by the user of the computer system.

Once a comparison of the two sequences has been performed at the state 210, a
determination is made at a decision state 212 whether the two sequences are the same. Of course,
the term "same" is not limited to sequences that are absolutely identical. Sequences that are within
the homology parameters entered by the user will be marked as "same" in the process 200.

If a determination is made that the two sequences are the same, the process 200 moves to a
state 214 wherein the name of the sequence from the database is displayed to the user. This state
notifies the user that the sequence with the displayed name fulfills the homology constraints that
were entered. Once the name of the stored sequence is displayed to the user, the process 200 moves
to a decision state 218 wherein a determination is made whether more sequences exist in the
database. If no more sequences exist in the database, then the process 200 terminates at an end state
220. However, if more sequences do exist in the database, then the process 200 moves to a state
224 wherein a pointer is moved to the next sequence in the database so that it can be compared to
the new sequence. In this manner, the new sequence is aligned and compared with every sequence
in the database.

It should be noted that if a determination had been made at the decision state 212 that the
sequences were not homologous, then the process 200 would move immediately to the decision
state 218 in order to determine if any other sequences were available in the database for
comparison.
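
By way of non-limiting illustration, the loop of process 200 may be sketched in Python as follows. The function names, the dictionary standing in for the database, the ungapped identity score and the 90% threshold are all assumptions made for this sketch and are not prescribed by the specification; a practical system would typically substitute a gapped alignment program for the scoring function.

```python
from typing import Callable, Dict, List


def percent_identity(new_seq: str, stored_seq: str) -> float:
    """Ungapped identity: matching positions over the length of the new sequence."""
    if not new_seq:
        return 0.0
    matches = sum(1 for x, y in zip(new_seq, stored_seq) if x == y)
    return 100.0 * matches / len(new_seq)


def search_database(
    new_seq: str,
    database: Dict[str, str],
    threshold: float = 90.0,
    score: Callable[[str, str], float] = percent_identity,
) -> List[str]:
    """Return the names of database sequences judged the "same" as new_seq."""
    hits = []
    for name, stored_seq in database.items():  # states 206 and 224: read each sequence in turn
        level = score(new_seq, stored_seq)     # state 210: compare the two sequences
        if level >= threshold:                 # decision state 212: within the user's parameters?
            hits.append(name)                  # state 214: report the name of the sequence
    return hits                                # end state 220: no more sequences in the database


# Hypothetical three-entry database used only to exercise the sketch.
toy_database = {"seqA": "ATGGCCTTAA", "seqB": "ATGGCCTTAG", "seqC": "TTTTTTTTTT"}
print(search_database("ATGGCCTTAA", toy_database, threshold=90.0))
```

With the toy database shown, a threshold of 90% reports seqA and seqB, whereas a threshold of 100% reports only seqA. Passing the scoring function as a parameter reflects the statement above that the comparison is not limited to exact matching.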

Accordingly, one aspect of the present invention is a computer system comprising a
processor and a data storage device having stored thereon a nucleic acid code of the invention or a
polypeptide code of the invention. In some embodiments the computer system further comprises a
data storage device having retrievably stored thereon reference nucleotide sequences or polypeptide
sequences to be compared to the nucleic acid code of the invention or polypeptide code of the
invention and a sequence comparer for conducting the comparison. For example, the sequence
comparer may comprise a computer program which indicates polymorphisms. In other aspects of
the computer system, the system further comprises an identifier which identifies features in said
sequence. The sequence comparer may indicate a homology level between the sequences compared
or identify motifs implicated in biological function and structural motifs in the nucleic acid code of
the invention and polypeptide codes of the invention or it may identify structural motifs in
sequences which are compared to these nucleic acid codes and polypeptide codes. In some
embodiments, the data storage device may have stored thereon the sequences of at least 2, 5, 10, 15,
20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic acid codes of the invention or polypeptide codes
of the invention.

Another aspect of the present invention is a method for determining the level of homology
between a nucleic acid code of the invention and a reference nucleotide sequence, comprising the
steps of reading the nucleic acid code and the reference nucleotide sequence through the use of a
computer program which determines homology levels and determining homology between the
nucleic acid code and the reference nucleotide sequence with the computer program. The computer
program may be any of a number of computer programs for determining homology levels, including
those specifically enumerated herein, including BLAST2N with the default parameters or with any
modified parameters. The method may be implemented using the computer systems described
above. The method may also be performed by reading 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or
200 of the above described nucleic acid codes of the invention through the use of the computer
program and determining homology between the nucleic acid codes and reference nucleotide
sequences.

Figure 4 is a flow diagram illustrating one embodiment of a process 250 in a computer for
determining whether two sequences are homologous. The process 250 begins at a start state 252 and
then moves to a state 254 wherein a first sequence to be compared is stored to a memory. The
second sequence to be compared is then stored to a memory at a state 256. The process 250 then
moves to a state 260 wherein the first character in the first sequence is read and then to a state 262
wherein the first character of the second sequence is read. It should be understood that if the
sequence is a nucleotide sequence, then the character would normally be either A, T, C, G or U. If
the sequence is a protein sequence, then it should be in the single letter amino acid code so that the
first and second sequences can be easily compared.

A determination is then made at a decision state 264 whether the two characters are the
same. If they are the same, then the process 250 moves to a state 268 wherein the next characters in
the first and second sequences are read. A determination is then made whether the next characters
are the same. If they are, then the process 250 continues this loop until two characters are not the
same. If a determination is made that the next two characters are not the same, the process 250
moves to a decision state 274 to determine whether there are any more characters in either sequence
to read.

If there are no more characters to read, then the process 250 moves to a state 276
wherein the level of homology between the first and second sequences is displayed to the user. The
level of homology is determined by calculating the proportion of characters between the sequences
that were the same out of the total number of characters in the first sequence. Thus, if every
character in a first 100 nucleotide sequence aligned with every character in a second sequence, the
homology level would be 100%.
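
The character-by-character comparison of process 250 may be sketched, purely as an illustration, as follows. The function name and the decision to stop reading when either sequence is exhausted are assumptions of this sketch; the denominator is the length of the first sequence, as described above, and no gaps are introduced.

```python
def homology_level(first: str, second: str) -> float:
    """Percentage of positions in `first` whose character matches `second`."""
    if not first:
        raise ValueError("the first sequence must contain at least one character")
    matches = 0
    position = 0
    # States 260 to 274: read one character from each sequence and compare,
    # continuing until either sequence has no more characters to read.
    while position < len(first) and position < len(second):
        if first[position] == second[position]:  # decision state 264
            matches += 1
        position += 1                            # state 268: advance to the next characters
    # State 276: matching characters out of the total characters in the first sequence.
    return 100.0 * matches / len(first)
```

For example, `homology_level("ATCG", "ATGG")` evaluates to 75.0 because three of the four characters of the first sequence are matched, and two identical sequences give 100.0.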

Alternatively, the computer program may be a computer program which compares the
nucleotide sequences of the nucleic acid codes of the present invention to reference nucleotide
sequences in order to determine whether the nucleic acid code of the invention differs from a
reference nucleic acid sequence at one or more positions. Optionally such a program records the
length and identity of inserted, deleted or substituted nucleotides with respect to the sequence of
either the reference polynucleotide or the nucleic acid code of the invention. In one embodiment,
the computer program may be a program which determines whether the nucleotide sequences of the
nucleic acid codes of the invention contain one or more single nucleotide polymorphisms (SNP)
with respect to a reference nucleotide sequence. These single nucleotide polymorphisms may each
comprise a single base substitution, insertion, or deletion.
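
A program of this kind, recording the position, length and identity of each difference, may be sketched as follows. This is a minimal illustration only: it relies on the general-purpose `difflib.SequenceMatcher` class of the Python standard library rather than on a biological alignment program, and the `Difference` record and `find_differences` function are hypothetical names introduced for the sketch.

```python
from difflib import SequenceMatcher
from typing import List, NamedTuple


class Difference(NamedTuple):
    kind: str         # "substitution", "insertion" or "deletion"
    ref_pos: int      # 0-based position in the reference sequence
    ref_bases: str    # bases present in the reference at that position
    query_bases: str  # bases present in the compared sequence


def find_differences(reference: str, query: str) -> List[Difference]:
    """List the differences of `query` with respect to `reference`."""
    kinds = {"replace": "substitution", "insert": "insertion", "delete": "deletion"}
    matcher = SequenceMatcher(None, reference, query, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A "replace" block whose two sides differ in length is still
            # reported here as a substitution; this is a simplification.
            differences.append(Difference(kinds[tag], i1, reference[i1:i2], query[j1:j2]))
    return differences


print(find_differences("ATGGCCTTAA", "ATGGCATTAA"))
```

In the example call the two sequences differ by a single base, and the sketch reports one substitution at reference position 5 (C in the reference, A in the compared sequence). The length of each difference is available as the length of the recorded bases.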

Another embodiment of the present invention is a method for comparing a first sequence to
a reference sequence wherein the first sequence is selected from the group consisting of a cDNA
code of SEQ ID NOs. 1-297 and a polypeptide code of SEQ ID NOs. 298-594 comprising the steps
of reading the first sequence and the reference sequence through use of a computer program which
compares sequences and determining differences between the first sequence and the reference
sequence with the computer program. In some aspects of this embodiment, said step of determining
differences between the first sequence and the reference sequence comprises identifying
polymorphisms.

Another aspect of the present invention is a method for determining the level of homology
between a polypeptide code of the invention and a reference polypeptide sequence, comprising the
steps of reading the polypeptide code of the invention and the reference polypeptide sequence
through use of a computer program which determines homology levels and determining homology
between the polypeptide code and the reference polypeptide sequence using the computer program.

Accordingly, another aspect of the present invention is a method for determining whether a
nucleic acid code of the invention differs at one or more nucleotides from a reference nucleotide
sequence comprising the steps of reading the nucleic acid code and the reference nucleotide
sequence through use of a computer program which identifies differences between nucleic acid
sequences and identifying differences between the nucleic acid code and the reference nucleotide
sequence with the computer program. In some embodiments, the computer program is a program
which identifies single nucleotide polymorphisms. The method may be implemented by the
computer systems described above and the method illustrated in Figure 4. The method may also be
performed by reading at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic acid
codes of the invention and the reference nucleotide sequences through the use of the computer
program and identifying differences between the nucleic acid codes and the reference nucleotide
sequences with the computer program.

Thus, another embodiment of the present invention is a method for comparing a first
sequence to a reference sequence wherein the first sequence is selected from the group consisting of
the nucleic acid codes of the present invention or the polypeptide codes of the present invention
comprising the steps of reading the first sequence and the reference sequence through use of a
computer program which compares sequences and determining differences between the first
sequence and the reference sequence with the computer program. In some aspects of this
embodiment, said step of determining differences between the first sequence and the reference
sequence comprises identifying polymorphisms.

Another aspect of the present invention is a method for determining the level of identity
between a first sequence and a reference sequence, wherein the first sequence is selected from the
group consisting of the nucleic acid codes of the present invention or the polypeptide codes of the
present invention, comprising the steps of reading the first sequence and the reference sequence
through the use of a computer program which determines identity levels and determining identity
between the first sequence and the reference sequence with the computer program.

In other embodiments the computer based system may further comprise an identifier for
identifying features within the nucleotide sequences of the nucleic acid codes of the invention or the
amino acid sequences of the polypeptide codes of the invention. An "identifier" refers to one or
more programs which identify certain features within the above-described nucleotide sequences of
the nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the
invention. In one embodiment, the identifier may comprise a program which identifies an open
reading frame in the cDNA codes of the invention.

Another embodiment of the present invention is a method for identifying a feature in a
sequence selected from the group consisting of the nucleic acid codes of the invention or the amino
acid sequences of the polypeptide codes of the invention comprising the steps of reading the
sequence through the use of a computer program which identifies features in sequences and
identifying features in the sequence with said computer program. In one aspect of this embodiment,
the computer program comprises a computer program which identifies open reading frames. In a
further embodiment, the computer program comprises a program that identifies linear or structural
motifs in a polypeptide sequence.
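
As a hedged illustration of a program which identifies open reading frames, the following sketch scans the three forward reading frames of a DNA sequence for an ATG codon followed in frame by a stop codon. The function name, the restriction to the forward strand and to the DNA alphabet, the use of the standard stop codons TAA, TAG and TGA, and the `min_codons` parameter are assumptions made for this sketch; the specification does not describe any particular algorithm.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for forward-strand ORFs.

    Each reported span begins at an ATG and ends with the first in-frame
    stop codon; `min_codons` counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Scan codon by codon, in frame, for the first stop codon.
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return sorted(orfs)


print(find_orfs("CCATGAAATAGGG"))
```

For the example sequence the sketch reports a single span, (2, 11), corresponding to ATGAAATAG. A fuller implementation would also examine the reverse complement strand.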

Figure 5 is a flow diagram illustrating one embodiment of an identifier process 300 for
detecting the presence of a feature in a sequence. The process 300 begins at a start state 302 and
then moves to a state 304 wherein a first sequence that is to be checked for features is stored to a
memory 115 in the computer system 100. The process 300 then moves to a state 306 wherein a
database of sequence features is opened. Such a database would include a list of each feature's
attributes along with the name of the feature. For example, a feature name could be "Initiation
Codon" and the attribute would be "ATG". Another example would be the feature name
"TAATAA Box" and the feature attribute would be "TAATAA". An example of such a database is
produced by the University of Wisconsin Genetics Computer Group (www.gcg.com).

Once the database of features is opened at the state 306, the process 300 moves to a state
308 wherein the first feature is read from the database. A comparison of the attribute of the first
feature with the first sequence is then made at a state 310. A determination is then made at a
decision state 316 whether the attribute of the feature was found in the first sequence. If the
attribute was found, then the process 300 moves to a state 318 wherein the name of the found
feature is displayed to the user.

The process 300 then moves to a decision state 320 wherein a determination is made
whether more features exist in the database. If no more features exist, then the process 300
terminates at an end state 324. However, if more features do exist in the database, then the process
300 reads the next sequence feature at a state 326 and loops back to the state 310 wherein the
attribute of the next feature is compared against the first sequence.

It should be noted that if the feature attribute is not found in the first sequence at the
decision state 316, the process 300 moves directly to the decision state 320 in order to determine if
any more features exist in the database.
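
The identifier process 300 may be sketched, by way of example only, as follows. The two entries of the feature database are the examples given above; the dictionary representation, the function name and the use of a plain substring search are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

# Feature name mapped to feature attribute, using the two examples from the text.
FEATURE_DATABASE: Dict[str, str] = {
    "Initiation Codon": "ATG",
    "TAATAA Box": "TAATAA",
}


def identify_features(sequence: str, feature_db: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (feature name, first position) for each attribute found in `sequence`."""
    found = []
    for name, attribute in feature_db.items():  # states 308 and 326: read each feature
        position = sequence.find(attribute)     # state 310: compare attribute with sequence
        if position != -1:                      # decision state 316: was the attribute found?
            found.append((name, position))      # state 318: report the name of the feature
    return found                                # end state 324: no more features


print(identify_features("GGTAATAACCATGTT", FEATURE_DATABASE))
```

For the example sequence, the sketch reports the initiation codon at position 10 and the TAATAA box at position 2 (positions counted from zero). Only the first occurrence of each attribute is reported, which is sufficient to decide whether the feature is present.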

In another embodiment, the identifier may comprise a molecular modeling program which
determines the 3-dimensional structure of the polypeptide codes of the invention. Such programs
may use any methods known to those skilled in the art including methods based on homology-modeling,
fold recognition and ab initio methods as described in Sternberg et al., 1999, which
disclosure is hereby incorporated by reference in its entirety. In some embodiments, the molecular
modeling program identifies target sequences that are most compatible with profiles representing
the structural environments of the residues in known three-dimensional protein structures. (See,
e.g., Eisenberg et al., U.S. Patent No. 5,436,850 issued July 25, 1995, which disclosure is hereby
incorporated by reference in its entirety). In another technique, the known three-dimensional
structures of proteins in a given family are superimposed to define the structurally conserved
regions in that family. This protein modeling technique also uses the known three-dimensional
structure of a homologous protein to approximate the structure of the polypeptide codes of the
invention. (See e.g., Srinivasan, et al., U.S. Patent No. 5,557,535 issued September 17, 1996, which
disclosure is hereby incorporated by reference in its entirety). Conventional homology modeling
techniques have been used routinely to build models of proteases and antibodies. (Sowdhamini et
al., (1997)). Comparative approaches can also be used to develop three-dimensional protein models
when the protein of interest has poor sequence identity to template proteins. In some cases, proteins
fold into similar three-dimensional structures despite having very weak sequence identities. For
example, the three-dimensional structures of a number of helical cytokines fold in similar
three-dimensional topology in spite of weak sequence homology.

The recent development of threading methods now enables the identification of likely
folding patterns in a number of situations where the structural relatedness between target and
template(s) is not detectable at the sequence level. Hybrid methods may also be used, in which fold
recognition is performed using Multiple Sequence Threading (MST), structural equivalencies are
deduced from the threading output using a distance geometry program DRAGON to construct a low
resolution model, and a full-atom representation is constructed using a molecular modeling package
such as QUANTA.

According to this 3-step approach, candidate templates are first identified by using the
novel fold recognition algorithm MST, which is capable of performing simultaneous threading of
multiple aligned sequences onto one or more 3-D structures. In a second step, the structural
equivalencies obtained from the MST output are converted into interresidue distance restraints and
fed into the distance geometry program DRAGON, together with auxiliary information obtained
from secondary structure predictions. The program combines the restraints in an unbiased manner
and rapidly generates a large number of low resolution model conformations. In a third step, these
low resolution model conformations are converted into full-atom models and subjected to energy
minimization using the molecular modeling package QUANTA. (See e.g., Aszodi et al., (1997)).

The results of the molecular modeling analysis may then be used in rational drug design
techniques to identify agents which modulate the activity of the polypeptide codes of the invention.

Accordingly, another aspect of the present invention is a method of identifying a feature
within the nucleic acid codes of the invention or the polypeptide codes of the invention comprising
reading the nucleic acid code(s) or the polypeptide code(s) through the use of a computer program
which identifies features therein and identifying features within the nucleic acid code(s) or
polypeptide code(s) with the computer program. In one embodiment, the computer program comprises
a computer program which identifies open reading frames. In a further embodiment, the computer
program identifies linear or structural motifs in a polypeptide sequence. In another embodiment,
the computer program comprises a molecular modeling program. The method may be performed by
reading a single sequence or at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic
acid codes of the invention or the polypeptide codes of the invention through the use of the
computer program and identifying features within the nucleic acid codes or polypeptide codes with
the computer program.

The nucleic acid codes of the invention or the polypeptide codes of the invention may be
stored and manipulated in a variety of data processor programs in a variety of formats. For
example, they may be stored as text in a word processing file, such as Microsoft WORD or
WORDPERFECT or as an ASCII file in a variety of database programs familiar to those of skill in
the art, such as DB2, SYBASE, or ORACLE. In addition, many computer programs and databases
may be used as sequence comparers, identifiers, or sources of reference nucleotide or polypeptide
sequences to be compared to the nucleic acid codes of the invention or the polypeptide codes of the
invention. The following list is intended not to limit the invention but to provide guidance to
programs and databases which are useful with the nucleic acid codes of the invention or the
polypeptide codes of the invention. The programs and databases which may be used include, but
are not limited to: MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine
(Molecular Applications Group), Look (Molecular Applications Group), MacLook (Molecular
Applications Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX (Altschul et al., 1990),
FASTA (Pearson and Lipman, 1988), FASTDB (Brutlag et al., 1990), Catalyst (Molecular
Simulations Inc.), Catalyst/SHAPE (Molecular Simulations Inc.), Cerius2.DB Access (Molecular
Simulations Inc.), HypoGen (Molecular Simulations Inc.), Insight II (Molecular Simulations Inc.),
Discover (Molecular Simulations Inc.), CHARMm (Molecular Simulations Inc.), Felix (Molecular
Simulations Inc.), DelPhi (Molecular Simulations Inc.), QuanteMM (Molecular Simulations Inc.),
Homology (Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular
Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular
Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer
(Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the EMBL/Swissprotein
database, the MDL Available Chemicals Directory database, the MDL Drug Data Report database,
the Comprehensive Medicinal Chemistry database, Derwent's World Drug Index database, the
BioByteMasterFile database, the Genbank database, and the Genseqn database. Many other
programs and databases would be apparent to one of skill in the art given the present disclosure.

Motifs which may be detected using the above programs include sequences encoding
leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites, alpha helices, and
beta sheets, signal sequences encoding signal peptides which direct the secretion of the encoded
proteins, sequences implicated in transcription regulation such as homeoboxes, acidic stretches,
enzymatic active sites, substrate binding sites, and enzymatic cleavage sites.

Conclusion

As discussed above, the GENSET polynucleotides and polypeptides of the present
invention or fragments thereof can be used for various purposes. The polynucleotides can be used
to express recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in disease states); as molecular weight
markers on Southern gels; as chromosome markers or tags (when labeled) to identify chromosomes
or to map related gene positions; as a reagent (including a labeled reagent) in assays designed to
quantitatively determine levels of GENSET expression in biological samples; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes to
hybridize and thus discover novel, related DNA sequences; as a source of information to derive
PCR primers for genetic fingerprinting; for selecting and making oligomers for attachment to a
"gene chip" or other support, including for examination for expression patterns; to raise anti-protein
antibodies using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the
polynucleotide can also be used in interaction trap assays (such as, for example, that described in
Gyuris et al., (1993)) to identify polynucleotides encoding the other protein with which binding
occurs or to identify inhibitors of the binding interaction.

The proteins or polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its receptor)
in biological fluids; as markers for tissues in which the corresponding protein is preferentially
expressed (either constitutively or at a particular stage of tissue differentiation or development or in
a disease state); and, of course, to isolate correlative receptors or ligands. Where the protein binds
or potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the
protein can be used to identify the other protein with which binding occurs or to identify inhibitors
of the binding interaction. Proteins involved in these binding interactions can also be used to screen
for peptide or small molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or kit
format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning; A Laboratory
Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E.F. Fritsch and T. Maniatis
eds., 1989, and "Methods in Enzymology; Guide to Molecular Cloning Techniques", Academic
Press, Berger and Kimmel eds., 1987, which disclosures are hereby incorporated by reference in
their entireties.

Polynucleotides and proteins of the present invention can also be used as nutritional sources
or supplements. Such uses include without limitation use as a protein or amino acid supplement,
use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In such cases
the protein or polynucleotide of the invention can be added to the feed of a particular organism or
can be administered as a separate solid or liquid preparation, such as in the form of powder, pills,
solutions, suspensions or capsules. In the case of microorganisms, the protein or polynucleotide of
the invention can be added to the medium in or on which the microorganism is cultured.

Although this invention has been described in terms of certain preferred embodiments, other
embodiments which will be apparent to those of ordinary skill in the art in view of the disclosure
herein are also within the scope of this invention. Accordingly, the scope of the invention is
intended to be defined only by reference to the appended claims.

Preparation of Antibody Compositions to GENSET proteins

Substantially pure protein or polypeptide is isolated from transfected or transformed cells
containing an expression vector encoding a GENSET protein or a portion thereof. The
concentration of protein in the final preparation is adjusted, for example, by concentration on an
Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows:

A. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes in the GENSET protein or a portion thereof can be
prepared from murine hybridomas according to the classical method of Kohler and Milstein, (1975)
or derivative methods thereof. Also see Harlow and Lane, (1988).

Briefly, a mouse is repetitively inoculated with a few micrograms of the GENSET protein 
or a portion thereof over a period of a few weeks. The mouse is then sacrificed, and the antibody 
producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol 
15 with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on 

selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and 
aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is 
continued. Antibody-producing clones are identified by detection of antibody in the supernatant 
fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, 
(1980), which disclosure is hereby incorporated by reference in its entirety, and derivative methods
thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested
for use. Detailed procedures for monoclonal antibody production are described in Davis et al.
(1986), Section 21-2.
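
The dilution step above can be illustrated with a short calculation. This is a minimal sketch only: it assumes that the number of fused cells landing in each well follows a Poisson distribution, which the text does not state, and the function name `poisson_well_stats` and the example seeding densities are hypothetical. It shows why lower seeding densities make it more likely that a well showing growth derives from a single cell.

```python
import math


def poisson_well_stats(mean_cells_per_well: float) -> dict:
    """Estimate well occupancy after dilution, assuming Poisson seeding.

    mean_cells_per_well is the average number of viable fused cells
    delivered to each well (an assumed, illustrative parameter).
    """
    if mean_cells_per_well <= 0:
        raise ValueError("mean_cells_per_well must be positive")
    lam = mean_cells_per_well
    p_empty = math.exp(-lam)          # wells that receive no cell
    p_single = lam * math.exp(-lam)   # wells that receive exactly one cell
    p_growth = 1.0 - p_empty          # wells that receive at least one cell
    return {
        "empty": p_empty,
        "single": p_single,
        # Probability that a well with growth started from one cell.
        "clonal_given_growth": p_single / p_growth,
    }


if __name__ == "__main__":
    for density in (0.3, 1.0, 3.0):
        stats = poisson_well_stats(density)
        print(density, round(stats["clonal_given_growth"], 3))
```

Under this assumption, wells that score positive by immunoassay at a low seeding density are more likely to contain a single clone, at the cost of more empty wells per plate.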

B. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes in the GENSET
protein or a portion thereof can be prepared by immunizing a suitable non-human animal with the
GENSET protein or a portion thereof, which can be unmodified or modified to enhance
immunogenicity. The non-human animal selected is preferably a non-human mammal,
usually a mouse, rat, rabbit, goat, or horse. Alternatively, a crude preparation which has been
enriched for GENSET concentration can be used to generate antibodies. Such proteins, fragments
or preparations are introduced into the non-human mammal in the presence of an appropriate
adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is known in the art. In addition, the protein,
fragment or preparation can be pretreated with an agent which will increase antigenicity. Such
agents are known in the art and include, for example, methylated bovine serum albumin (mBSA),
bovine serum albumin (BSA), Hepatitis B surface antigen, and keyhole limpet hemocyanin (KLH).
Serum from the immunized animal is collected, treated and tested according to known procedures.
If the serum contains polyclonal antibodies to undesired epitopes, the polyclonal antibodies can be 
purified by immunoaffinity chromatography. 

Effective polyclonal antibody production is affected by many factors related both to the
antigen and the host species. Also, host animals vary in response to site of inoculations and dose,
with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng
level) of antigen administered at multiple intradermal sites appear to be most reliable. Techniques
for producing and processing polyclonal antisera are known in the art. An effective immunization
protocol for rabbits can be found in Vaitukaitis et al. (1971), which disclosure is hereby
incorporated by reference in its entirety.

Booster injections can be given at regular intervals, and antiserum harvested when its antibody
titer, as determined semi-quantitatively, for example, by double immunodiffusion in agar
against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al.
(1973), which disclosure is hereby incorporated by reference in its entirety. Plateau concentration
of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the
antisera for the antigen is determined by preparing competitive binding curves, as described, for
example, by Fisher (1980), which disclosure is hereby incorporated by reference in its entirety.

Antibody preparations prepared according to either the monoclonal or the polyclonal 
protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing
substances in biological samples; they are also used semi-quantitatively or qualitatively to
identify the presence of antigen in a biological sample. The antibodies may also be used in 
therapeutic compositions for killing cells expressing the protein or reducing the levels of the protein 
in the body. 

Biological assays 

Assaying GENSET Secreted Proteins to Determine Whether they Bind to the Cell Surface

The secreted proteins encoded by the GENSET cDNAs, preferably the proteins of SEQ ID 
NOs: 242-272 and 274-384, or fragments thereof are cloned into expression vectors. The proteins 
are purified by size, charge, immunochromatography or other techniques familiar to those skilled in 
the art. Following purification, the proteins are labeled using techniques known to those skilled in
the art. The labeled proteins are incubated with cells or cell lines derived from a variety of organs
or tissues to allow the proteins to bind to any receptor present on the cell surface. Following the 
incubation, the cells are washed to remove non-specifically bound protein. The labeled proteins are 
detected by autoradiography. Alternatively, unlabeled proteins may be incubated with the cells and 
detected with antibodies having a detectable label, such as a fluorescent molecule, attached thereto. 

Specificity of cell surface binding may be analyzed by conducting a competition analysis in
which various amounts of unlabeled protein are incubated along with the labeled protein. The
amount of labeled protein bound to the cell surface decreases as the amount of competitive
unlabeled protein increases. As a control, various amounts of an unlabeled protein unrelated to the
labeled protein are included in some binding reactions. The amount of labeled protein bound to the
cell surface does not decrease in binding reactions containing increasing amounts of unrelated
unlabeled protein, indicating that the protein encoded by the cDNA binds specifically to the cell
surface. 
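
The expected outcome of the competition analysis can be sketched numerically. The example below is illustrative only: it assumes a simple one-site equilibrium binding model in which free and total concentrations are approximately equal, and the function names, the dissociation constants `kd_nm` and `ki_nm`, and the concentrations used are hypothetical values not taken from this disclosure. An unrelated protein is modeled as having no affinity for the site (an infinite `ki_nm`).

```python
import math


def labeled_fraction_bound(labeled_nm, competitor_nm, kd_nm, ki_nm=math.inf):
    """Fractional site occupancy by labeled protein under a one-site model.

    A competitor with dissociation constant ki_nm raises the apparent Kd
    of the labeled protein by the factor (1 + competitor / Ki).
    ki_nm = inf models an unrelated protein that does not bind the site.
    """
    apparent_kd = kd_nm * (1.0 + competitor_nm / ki_nm)
    return labeled_nm / (labeled_nm + apparent_kd)


def competition_curve(competitor_doses_nm, labeled_nm, kd_nm, ki_nm=math.inf):
    """Occupancy by labeled protein at each competitor dose."""
    return [
        labeled_fraction_bound(labeled_nm, dose, kd_nm, ki_nm)
        for dose in competitor_doses_nm
    ]


if __name__ == "__main__":
    doses = [0.0, 1.0, 10.0, 100.0]
    # Unlabeled form of the same protein: assumed to share the same Kd.
    specific = competition_curve(doses, labeled_nm=1.0, kd_nm=1.0, ki_nm=1.0)
    # Unrelated protein: assumed not to bind the site at all.
    unrelated = competition_curve(doses, labeled_nm=1.0, kd_nm=1.0)
    print("specific :", [round(x, 3) for x in specific])
    print("unrelated:", [round(x, 3) for x in unrelated])
```

In this model the labeled signal falls steadily with increasing doses of the homologous unlabeled protein and stays flat with the unrelated protein, which is the pattern the text describes as evidence of specific binding.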

As discussed herein, secreted proteins have been shown to have a number of important 
physiological effects and, consequently, represent a valuable therapeutic resource. The secreted 
proteins encoded by the cDNAs or fragments thereof made using any of the methods described 
therein may be evaluated to determine their physiological activities as described below.

Assaying GENSET proteins or Fragments Thereof for Cytokine, Cell Proliferation or Cell 
Differentiation Activity 

Secreted proteins may act as cytokines or may affect cellular proliferation or differentiation. 
Many protein factors discovered to date, including all known cytokines, have exhibited activity in
one or more factor dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of a protein of the present invention is evidenced by
any one of a number of routine factor dependent cell proliferation assays for cell lines including,
without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+ (preB M+), 2E8, RB5,
DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e and CMK. The proteins encoded by the cDNAs of
the invention or fragments thereof may be evaluated for their ability to regulate T cell or thymocyte
proliferation in assays such as those described above or in the following references, which are
incorporated herein by reference: Current Protocols in Immunology, Ed. by J.E. Coligan et al.,
Greene Publishing Associates and Wiley-Interscience; Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology
133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J.
Immunol. 152:1756-1761, 1994.
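
One common way to summarize the readout of a factor dependent proliferation assay is a stimulation index, the ratio of the mean signal in wells receiving the test protein to the mean signal in control wells. The sketch below is illustrative only: the text does not prescribe a readout or a scoring rule, so the function name `stimulation_index`, the replicate values, and the idea of comparing the index against a chosen cutoff are all assumptions.

```python
from statistics import mean


def stimulation_index(treated_readouts, control_readouts):
    """Ratio of mean treated signal to mean control signal.

    Readouts may be any proliferation measure on a common scale
    (for example, label incorporation counts); values here are hypothetical.
    """
    if not treated_readouts or not control_readouts:
        raise ValueError("both replicate lists must be non-empty")
    control_mean = mean(control_readouts)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return mean(treated_readouts) / control_mean


if __name__ == "__main__":
    treated = [300, 330, 270]   # hypothetical replicate wells with test protein
    control = [100, 110, 90]    # hypothetical replicate wells without factor
    print(stimulation_index(treated, control))
```

An index well above 1 in a factor dependent cell line would be consistent with proliferative activity, although the cutoff treated as meaningful depends on the assay and its replicate variability.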

In addition, numerous assays for cytokine production and/or the proliferation of spleen 
cells, lymph node cells and thymocytes are known. These include the techniques disclosed in 
Current Protocols in Immunology, J.E. Coligan et al. Eds., Vol 1 pp. 3.12.1-3.12.14, John Wiley and
Sons, Toronto. 1994; and Schreiber, R.D., Current Protocols in Immunology, supra, Vol 1 pp.
6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

The proteins encoded by the cDNAs of the invention may also be assayed for the ability to 
regulate the proliferation and differentiation of hematopoietic or lymphopoietic cells. Many assays 
for such activity are familiar to those skilled in the art, including the assays in the following
references, which are incorporated herein by reference: Bottomly, K., Davis, L.S. and Lipsky, P.E.,
Measurement of Human and Murine Interleukin 2 and Interleukin 4, Current Protocols in
Immunology, J.E. Coligan et al. Eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Nordan, R., Measurement of
Mouse and Human Interleukin 6, Current Protocols in Immunology, J.E. Coligan et al. Eds. Vol 1
pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A.
83:1857-1861, 1986; Bennett, F., Giannotti, J., Clark, S.C. and Turner, K.J., Measurement of
Human Interleukin 11, Current Protocols in Immunology, J.E. Coligan et al. Eds. Vol 1 pp. 6.15.1,
John Wiley and Sons, Toronto. 1991; Ciarletta, A., Giannotti, J., Clark, S.C. and Turner, K.J.,
Measurement of Mouse and Human Interleukin 9, Current Protocols in Immunology, J.E. Coligan et
al. Eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

The proteins encoded by the cDNAs of the invention may also be assayed for their ability to 
regulate T-cell responses to antigens. Many assays for such activity are familiar to those skilled in 
the art, including the assays described in the following references, which are incorporated herein by 
reference: Chapter 3 (In vitro Assays for Mouse Lymphocyte Function), Chapter 6 (Cytokines and
Their Cellular Receptors) and Chapter 7 (Immunologic Studies in Humans) in Current Protocols in
Immunology, J.E. Coligan et al. Eds., Greene Publishing Associates and Wiley-Interscience;
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J.
Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol.
140:508-512, 1988.

Those proteins which exhibit cytokine, cell proliferation, or cell differentiation activity may
then be formulated as pharmaceuticals and used to treat clinical conditions in which induction of
cell proliferation or differentiation is beneficial. Alternatively, as described in more detail below,
genes encoding these proteins or nucleic acids regulating the expression of these proteins may be
introduced into appropriate host cells to increase or decrease the expression of the proteins as
desired.

Assaying GENSET proteins or Fragments Thereof for Activity as Immune System Regulators 

The proteins encoded by the cDNAs of the invention may also be evaluated for their effects 
as immune regulators. For example, the proteins may be evaluated for their activity to influence 
thymocyte or splenocyte cytotoxicity. Numerous assays for such activity are familiar to those
skilled in the art, including the assays described in the following references, which are incorporated
herein by reference: Chapter 3 (In vitro Assays for Mouse Lymphocyte Function 3.1-3.19) and
Chapter 7 (Immunologic Studies in Humans) in Current Protocols in Immunology, J.E. Coligan et
al. Eds., Greene Publishing Associates and Wiley-Interscience; Herrmann et al., Proc. Natl. Acad.
Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effects
on T-cell dependent immunoglobulin responses and isotype switching. Numerous assays for such
activity are familiar to those skilled in the art, including the assays disclosed in the following
references, which are incorporated herein by reference: Maliszewski, J. Immunol. 144:3028-3033,
1990; Mond, J.J. and Brunswick, M., Assays for B Cell Function: In vitro Antibody Production, Vol
1 pp. 3.8.1-3.8.16 in Current Protocols in Immunology, J.E. Coligan et al. Eds., John Wiley and
Sons, Toronto. 1994.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effect 
on immune effector cells, including their effect on Th1 cells and cytotoxic lymphocytes. Numerous
assays for such activity are familiar to those skilled in the art, including the assays disclosed in the
following references, which are incorporated herein by reference: Chapter 3 (In vitro Assays for
Mouse Lymphocyte Function 3.1-3.19) and Chapter 7 (Immunologic Studies in Humans) in Current
Protocols in Immunology, J.E. Coligan et al. Eds., Greene Publishing Associates and
Wiley-Interscience; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effect
on dendritic cell mediated activation of naive T-cells. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in the following references, which
are incorporated herein by reference: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al.,
Journal of Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 1995; Nair
et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994;
Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal
of Clinical Investigation 94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine
172:631-640, 1990.

The proteins encoded by the cDNAs of the invention may also be evaluated for their
influence on the lifetime of lymphocytes. Numerous assays for such activity are familiar to those
skilled in the art, including the assays disclosed in the following references, which are incorporated
herein by reference: Darzynkiewicz et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia
7:659-670, 1993; Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243,
1991; Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., Proc.
Nat. Acad. Sci. USA 88:7548-7551, 1991.

Those proteins which exhibit activity as immune system regulators may then be
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of immune
activity is beneficial. For example, the protein may be useful in the treatment of various immune
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g., in
regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as affecting
the cytolytic activity of NK cells and other cell populations. These immune deficiencies may be
genetic or be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from
autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, fungal or
other infection may be treatable using a protein of the present invention, including infections by
HIV, hepatitis viruses, herpesviruses, mycobacteria, Leishmania spp., malaria spp. and various
fungal infections such as candidiasis. Of course, in this regard, a protein of the present invention
may also be useful where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, autoimmune
thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host disease and
autoimmune inflammatory eye disease. Such a protein of the present invention may also be
useful in the treatment of allergic reactions and conditions, such as asthma (particularly allergic
asthma) or other respiratory problems. Other conditions in which immune suppression is desired
(including, for example, organ transplantation) may also be treatable using a protein of the present
invention.

Using the proteins of the invention it may also be possible to regulate immune responses in
a number of ways. Down regulation may be in the form of inhibiting or blocking an immune
response already in progress or may involve preventing the induction of an immune response. The
functions of activated T-cells may be inhibited by suppressing T cell responses or by inducing
specific tolerance in T cells, or both. Immunosuppression of T cell responses is generally an active,
non-antigen-specific process which requires continuous exposure of the T cells to the suppressive
agent. Tolerance, which involves inducing non-responsiveness or anergy in T cells, is
distinguishable from immunosuppression in that it is generally antigen-specific and persists after
exposure to the tolerizing agent has ceased. Operationally, tolerance can be demonstrated by the
lack of a T cell response upon reexposure to specific antigen in the absence of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without limitation
B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high level lymphokine
synthesis by activated T cells, will be useful in situations of tissue, skin and organ transplantation
and in graft-versus-host disease (GVHD). For example, blockage of T cell function should result in
reduced tissue destruction in tissue transplantation. Typically, in tissue transplants, rejection of the
transplant is initiated through its recognition as foreign by T cells, followed by an immune reaction
that destroys the transplant. The administration of a molecule which inhibits or blocks interaction
of a B7 lymphocyte antigen with its natural ligand(s) on immune cells (such as a soluble,
monomeric form of a peptide having B7-2 activity alone or in conjunction with a monomeric form
of a peptide having an activity of another B lymphocyte antigen (e.g., B7-1, B7-3) or blocking
antibody), prior to transplantation can lead to the binding of the molecule to the natural ligand(s) on
the immune cells without transmitting the corresponding costimulatory signal. Blocking B
lymphocyte antigen function in this manner prevents cytokine synthesis by immune cells, such as T
cells, and thus acts as an immunosuppressant. Moreover, the lack of costimulation may also be
sufficient to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term
tolerance by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated
administration of these blocking reagents. To achieve sufficient immunosuppression or tolerance in
a subject, it may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular blocking reagents in preventing organ transplant rejection or
GVHD can be assessed using animal models that are predictive of efficacy in humans. Examples of
appropriate systems which can be used include allogeneic cardiac grafts in rats and xenogeneic
pancreatic islet cell grafts in mice, both of which have been used to examine the
immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et al.,
Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA 89:11102-11105 (1992).
In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven Press,
New York, 1989, pp. 846-847) can be used to determine the effect of blocking B lymphocyte
antigen function in vivo on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block costimulation of T
cells by disrupting receptor ligand interactions of B lymphocyte antigens can be used to inhibit T
cell activation and prevent production of autoantibodies or T cell-derived cytokines which may be
involved in the disease process. Additionally, blocking reagents may induce antigen-specific
tolerance of autoreactive T cells which could lead to long-term relief from the disease. The efficacy
of blocking reagents in preventing or alleviating autoimmune disorders can be determined using a
number of well-characterized animal models of human autoimmune diseases. Examples include
murine experimental autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or
NZB hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats,
and murine experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press,
New York, 1989, pp. 840-856).

Upregulation of an antigen function (preferably a B lymphocyte antigen function), as a
means of up regulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response through stimulating B lymphocyte
antigen function may be useful in cases of viral infection. In addition, systemic viral diseases such
as influenza, the common cold, and encephalitis might be alleviated by the systemic administration
of stimulatory forms of B lymphocyte antigens.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed APCs
either expressing a peptide of the present invention or together with a stimulatory form of a soluble
peptide of the present invention and reintroducing the in vitro activated T cells into the patient. The
infected cells would now be capable of delivering a costimulatory signal to T cells in vivo, thereby 
activating the T cells. 

In another application, up regulation or enhancement of antigen function (preferably B
lymphocyte antigen function) may be useful in the induction of tumor immunity. Tumor cells (e.g.,
sarcoma, melanoma, lymphoma, leukemia, neuroblastoma, carcinoma) transfected with a nucleic
acid encoding at least one peptide of the present invention can be administered to a subject to
overcome tumor-specific tolerance in the subject. If desired, the tumor cell can be transfected to
express a combination of peptides. For example, tumor cells obtained from a patient can be
transfected ex vivo with an expression vector directing the expression of a peptide having B7-2-like
activity alone, or in conjunction with a peptide having B7-1-like activity and/or B7-3-like activity.
The transfected tumor cells are returned to the patient to result in expression of the peptides on the
surface of the transfected cell. Alternatively, gene therapy techniques can be used to target a tumor
cell for transfection in vivo.

The presence of the peptide of the present invention having the activity of a B lymphocyte
antigen(s) on the surface of the tumor cell provides the necessary costimulation signal to T cells to
induce a T cell mediated immune response against the transfected tumor cells. In addition, tumor
cells which lack MHC class I or MHC class II molecules, or which fail to reexpress sufficient
amounts of MHC class I or MHC class II molecules, can be transfected with nucleic acids encoding
all or a fragment (e.g., a cytoplasmic-domain truncated fragment) of an MHC class I α chain
protein and β2 microglobulin protein or an MHC class II α chain protein and an MHC class II β
chain protein to thereby express MHC class I or MHC class II proteins on the cell surface.
Expression of the appropriate class I or class II MHC in conjunction with a peptide having the
activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T cell mediated immune
response against the transfected tumor cell. Optionally, a gene encoding an antisense construct
which blocks expression of an MHC class II associated protein, such as the invariant chain, can also
be cotransfected with a DNA encoding a peptide having the activity of a B lymphocyte antigen to
promote presentation of tumor associated antigens and induce tumor specific immunity. Thus, the
induction of a T cell mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject. Alternatively, as described in more detail below, genes
encoding these proteins or nucleic acids regulating the expression of these proteins may be
introduced into appropriate host cells to increase or decrease the expression of the proteins as
desired.
Assaying GENSET proteins or Fragments Thereof for Hematopoiesis Regulating Activity 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their hematopoiesis regulating activity. For example, the effect of the proteins on 
embryonic stem cell differentiation may be evaluated. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in the following references, which
are incorporated herein by reference: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et
al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915,
1993.

The proteins encoded by the cDNAs of the invention or fragments thereof may also be
evaluated for their influence on the lifetime of stem cells and stem cell differentiation. Numerous
assays for such activity are familiar to those skilled in the art, including the assays disclosed in the
following references, which are incorporated herein by reference: Freshney, M.G., Methylcellulose
Colony Forming Assays, in Culture of Hematopoietic Cells, R.I. Freshney et al. Eds. pp. 265-268,
Wiley-Liss, Inc., New York, NY. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911,
1992; McNiece, I.K. and Briddell, R.A., Primitive Hematopoietic Colony Forming Cells with High
Proliferative Potential, in Culture of Hematopoietic Cells, R.I. Freshney et al. Eds. Vol pp. 23-39,
Wiley-Liss, Inc., New York, NY. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Ploemacher, R.E., Cobblestone Area Forming Cell Assay, in Culture of Hematopoietic Cells, R.I.
Freshney et al. Eds. pp. 1-21, Wiley-Liss, Inc., New York, NY. 1994; Spooncer, E., Dexter, M. and
Allen, T., Long Term Bone Marrow Cultures in the Presence of Stromal Cells, in Culture of
Hematopoietic Cells, R.I. Freshney et al. Eds. pp. 163-179, Wiley-Liss, Inc., New York, NY. 1994;
and Sutherland, H.J., Long Term Culture Initiating Cell Assay, in Culture of Hematopoietic Cells,
R.I. Freshney et al. Eds. pp. 139-162, Wiley-Liss, Inc., New York, NY. 1994.

Those proteins which exhibit hematopoiesis regulatory activity may then be formulated as
pharmaceuticals and used to treat clinical conditions in which regulation of hematopoiesis is
beneficial. For example, a protein of the present invention may be useful in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell deficiencies. Even
marginal biological activity in support of colony forming cells or of factor-dependent cell lines
indicates involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy to
stimulate the production of erythroid precursors and/or erythroid cells; in supporting the growth and
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., traditional
CSF activity) useful, for example, in conjunction with chemotherapy to prevent or treat consequent
myelo-suppression; in supporting the growth and proliferation of megakaryocytes and consequently
of platelets, thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to platelet transfusions;
and/or in supporting the growth and proliferation of hematopoietic stem cells which are capable of
maturing to any and all of the above-mentioned hematopoietic cells and therefore find therapeutic
utility in various stem cell disorders (such as those usually treated with transplantation, including,
without limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in
repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or ex vivo
(i.e., in conjunction with bone marrow transplantation or with peripheral progenitor cell
transplantation (homologous or heterologous)) as normal cells or genetically manipulated for gene
therapy. Alternatively, as described in more detail below, genes encoding these proteins or nucleic
acids regulating the expression of these proteins may be introduced into appropriate host cells to
increase or decrease the expression of the proteins as desired.

Assaying GENSET proteins or Fragments Thereof for Regulation of Tissue Growth 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their effect on tissue growth. Numerous assays for such activity are familiar to those
skilled in the art, including the assays disclosed in International Patent Publication No. 
WO95/16035, International Patent Publication No. WO95/05846 and International Patent 
Publication No. WO91/07491, which are incorporated herein by reference. 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, HI and Rovee, DT, eds.), Year Book Medical
Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol. 71:382-84 (1978),
which are incorporated herein by reference.

Those proteins which are involved in the regulation of tissue growth may then be
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of tissue
growth is beneficial. For example, a protein of the present invention also may have utility in
compositions used for bone, cartilage, tendon, ligament and/or nerve tissue growth or regeneration,
as well as for wound healing and tissue repair and replacement, and in the treatment of burns,
incisions and ulcers.

A protein of the present invention, which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone fractures 
and cartilage damage or defects in humans and other animals. Such a preparation employing a 
protein of the invention may have prophylactic use in closed as well as open fracture reduction and 
also in the improved fixation of artificial joints. De novo bone formation induced by an osteogenic 
agent contributes to the repair of congenital, trauma induced, or oncologic resection induced 
craniofacial defects, and also is useful in cosmetic plastic surgery. 

A protein of this invention may also be used in the treatment of periodontal disease, and in 
other tooth repair processes. Such agents may provide an environment to attract bone-forming 
cells, stimulate growth of bone-forming cells or induce differentiation of progenitors of bone- 
forming cells. A protein of the invention may also be useful in the treatment of osteoporosis or 
osteoarthritis, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes. 

Another category of tissue regeneration activity that may be attributable to the protein of the 
present invention is tendon/ligament formation. A protein of the present invention, which induces 
tendon/ligament-like tissue or other tissue formation in circumstances where such tissue is not 
normally formed, has application in the healing of tendon or ligament tears, deformities and other 
tendon or ligament defects in humans and other animals. Such a preparation employing a 
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing damage to 
tendon or ligament tissue, as well as use in the improved fixation of tendon or ligament to bone or 
other tissues, and in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like 
tissue formation induced by a composition of the present invention contributes to the repair of 
congenital, trauma induced, or other tendon or ligament defects, and is also useful in 
cosmetic plastic surgery for attachment or repair of tendons or ligaments. The compositions of the 
present invention may provide an environment to attract tendon- or ligament-forming cells, 
stimulate growth of tendon- or ligament-forming cells, induce differentiation of progenitors of 
tendon- or ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo 
for return in vivo to effect tissue repair. The compositions of the invention may also be useful in the 
treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The 
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is 
well known in the art. 

The protein of the present invention may also be useful for proliferation of neural cells and 
for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral nervous 
system diseases and neuropathies, as well as mechanical and traumatic disorders, which involve 
degeneration, death or trauma to neural cells or nerve tissue. More specifically, a protein may be 
used in the treatment of diseases of the peripheral nervous system, such as peripheral nerve injuries, 
peripheral neuropathy and localized neuropathies, and central nervous system diseases, such as 
Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and 
Shy-Drager syndrome. Further conditions which may be treated in accordance with the present 
invention include mechanical and traumatic disorders, such as spinal cord disorders, head trauma 
and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from chemotherapy 
or other medical therapies may also be treatable using a protein of the invention. 

Proteins of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

It is expected that a protein of the present invention may also exhibit activity for generation 
or regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the 
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue to 
regenerate. A protein of the invention may also exhibit angiogenic activity. 

A protein of the present invention may also be useful for gut protection or regeneration and 
treatment of lung or liver fibrosis, reperfusion injury in various tissues, and conditions resulting 
from systemic cytokine damage. 

A protein of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

Alternatively, as described in more detail below, genes encoding these proteins or nucleic 
acids regulating the expression of these proteins may be introduced into appropriate host cells to 
increase or decrease the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Regulation of Reproductive Hormones or 
Cell Movement 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their ability to regulate reproductive hormones, such as follicle stimulating hormone. 
Numerous assays for such activity are familiar to those skilled in the art, including the assays 
disclosed in the following references, which are incorporated herein by reference: Vale et al., 
Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. 
USA 83:3091-3095, 1986; Chapter 6.12 (Measurement of Alpha and Beta Chemokines), Current 
Protocols in Immunology, J.E. Coligan et al., Eds., Greene Publishing Associates and 
Wiley-Interscience; Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; 
Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994; 
Johnston et al., J. of Immunol. 153:1762-1768, 1994. 

Those proteins which exhibit activity as reproductive hormones or regulators of cell 
movement may then be formulated as pharmaceuticals and used to treat clinical conditions in which 
regulation of reproductive hormones or cell movement is beneficial. For example, a protein of the 
present invention may also exhibit activin- or inhibin-related activities. Inhibins are characterized 
by their ability to inhibit the release of follicle stimulating hormone (FSH), while activins are 
characterized by their ability to stimulate the release of follicle stimulating hormone (FSH). Thus, a 
protein of the present invention, alone or in heterodimers with a member of the inhibin α family, 
may be useful as a contraceptive based on the ability of inhibins to decrease fertility in female 
mammals and decrease spermatogenesis in male mammals. Administration of sufficient amounts of 
other inhibins can induce infertility in these mammals. Alternatively, the protein of the invention, 
as a homodimer or as a heterodimer with other protein subunits of the inhibin-β group, may be 
useful as a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate 
FSH release from cells of the anterior pituitary. See, for example, United States Patent 4,798,885, 
the disclosure of which is incorporated herein by reference. A protein of the invention may also be 
useful for advancement of the onset of fertility in sexually immature mammals, so as to increase the 
lifetime reproductive performance of domestic animals such as cows, sheep and pigs. 

Alternatively, as described in more detail below, genes encoding these proteins or nucleic 
acids regulating the expression of these proteins may be introduced into appropriate host cells to 
increase or decrease the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Chemotactic/Chemokinetic Activity 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for chemotactic/chemokinetic activity. For example, a protein of the present invention 
may have chemotactic or chemokinetic activity (e.g., act as a chemokine) for mammalian cells, 
including, for example, monocytes, fibroblasts, neutrophils, T-cells, mast cells, eosinophils, 
epithelial and/or endothelial cells. Chemotactic and chemokinetic proteins can be used to mobilize 
or attract a desired cell population to a desired site of action. Chemotactic or chemokinetic proteins 
provide particular advantages in treatment of wounds and other trauma to tissues, as well as in 
treatment of localized infections. For example, attraction of lymphocytes, monocytes or neutrophils 
to tumors or sites of infection may result in improved immune responses against the tumor or 
infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population. 
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells 
across a membrane as well as the ability of a protein to induce the adhesion of one cell population 
to another cell population. Suitable assays for movement and adhesion include, without limitation, 
those described in: Current Protocols in Immunology, Ed. by J.E. Coligan, A.M. Kruisbeek, D.H. 
Margulies, E.M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience 
(Chapter 6.12, Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. 
Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; Mueller et al., Eur. J. Immunol. 
25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994; Johnston et al., J. of Immunol. 
153:1762-1768, 1994. 

Assaying GENSET proteins or Fragments Thereof for Regulation of Blood Clotting 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their effects on blood clotting. Numerous assays for such activity are familiar to those 
skilled in the art, including the assays disclosed in the following references, which are incorporated 
herein by reference: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis 
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

Those proteins which are involved in the regulation of blood clotting may then be 
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of blood 
clotting is beneficial. For example, a protein of the invention may also exhibit hemostatic or 
thrombolytic activity. As a result, such a protein is expected to be useful in treatment of various 
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance 
coagulation and other hemostatic events in treating wounds resulting from trauma, surgery or other 
causes. A protein of the invention may also be useful for dissolving or inhibiting formation of 
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for 
example, infarction of cardiac and central nervous system vessels (e.g., stroke)). Alternatively, as 
described in more detail below, genes encoding these proteins or nucleic acids regulating the 
expression of these proteins may be introduced into appropriate host cells to increase or decrease 
the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Involvement in Receptor/Ligand Interactions 

The proteins encoded by the cDNAs or a fragment thereof may also be evaluated for their 
involvement in receptor/ligand interactions. Numerous assays for such involvement are familiar to 
those skilled in the art, including the assays disclosed in the following references, which are 
incorporated herein by reference: Chapter 7.28 (Measurement of Cellular Adhesion under Static 
Conditions 7.28.1-7.28.22) in Current Protocols in Immunology, J.E. Coligan et al., Eds., Greene 
Publishing Associates and Wiley-Interscience; Takai et al., Proc. Natl. Acad. Sci. USA 
84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 
169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 
80:661-670, 1995; Gyuris et al., Cell 75:791-803, 1993. 

For example, the proteins of the present invention may also demonstrate activity as 
receptors, receptor ligands or inhibitors or agonists of receptor/ligand interactions. Examples of 
such receptors and ligands include, without limitation, cytokine receptors and their ligands, receptor 
kinases and their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell 
interactions and their ligands (including without limitation, cellular adhesion molecules (such as 
selectins, integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, 
antigen recognition and development of cellular and humoral immune responses). Receptors and 
ligands are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. Proteins of the present invention (including, without limitation, 
fragments of receptors and ligands) may themselves be useful as inhibitors of receptor/ligand 
interactions. 

Assaying GENSET proteins or Fragments Thereof for Anti-Inflammatory Activity 

The proteins encoded by the cDNAs or a fragment thereof may also be evaluated for 
anti-inflammatory activity. The anti-inflammatory activity may be achieved by providing a stimulus to 
cells involved in the inflammatory response, by inhibiting or promoting cell-cell interactions (such 
as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the 
inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or suppressing 
production of other factors which more directly inhibit or promote an inflammatory response. 
Proteins exhibiting such activities can be used to treat inflammatory conditions (including chronic or 
acute conditions), including without limitation inflammation associated with infection (such as 
septic shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion 
injury, endotoxin lethality, arthritis, complement-mediated hyperacute rejection, 
nephritis, cytokine or chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease 
or resulting from over production of cytokines such as TNF or IL-1. Proteins of the invention may 
also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 

Assaying GENSET proteins or Fragments Thereof for Tumor Inhibition Activity 

The proteins encoded by the cDNAs of the invention or a fragment thereof may also be 
evaluated for tumor inhibition activity. In addition to the activities described above for 
immunological treatment or prevention of tumors, a protein of the invention may exhibit other 
anti-tumor activities. A protein may inhibit tumor growth directly or indirectly (such as, for example, 
via ADCC). A protein may exhibit its tumor inhibitory activity by acting on tumor tissue or tumor 
precursor tissue, by inhibiting formation of tissues necessary to support tumor growth (such as, for 
example, by inhibiting angiogenesis), by causing production of other factors, agents or cell types 
which inhibit tumor growth, or by suppressing, eliminating or inhibiting factors, agents or cell types 
which promote tumor growth. 

A protein of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing or 
enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such 
as, for example, breast augmentation or diminution, change in bone form or shape); affecting 
biorhythms or circadian cycles or rhythms; affecting the fertility of male or female subjects; 
affecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of 
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional factors or 
component(s); affecting behavioral characteristics, including, without limitation, appetite, libido, 
stress, cognition (including cognitive disorders), depression (including depressive disorders) and 
violent behaviors; providing analgesic effects or other pain reducing effects; promoting 
differentiation and growth of embryonic stem cells in lineages other than hematopoietic lineages; 
hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of the enzyme and 
treating deficiency-related diseases; treatment of hyperproliferative disorders (such as, for example, 
psoriasis); immunoglobulin-like activity (such as, for example, the ability to bind antigens or 
complement); and the ability to act as an antigen in a vaccine composition to raise an immune 
response against such protein or another material or entity which is cross-reactive with such protein. 

83    1 14-UZo-Z-U-t 1-to    DNA    pBluescriptII SK-
84    1 1 4-UJZ- 1 -U-ri 1 U-to    DNA    pBluescriptII SK-
85    1 14-U4j-Z-U-/VlU-to    DNA    pBluescriptII SK-
86    1 14-U44- 1 -U-tD-to    DNA    pBluescriptII SK-
87    1 1 0-UU 5 -5 -U-D 1 U-to    DNA    pBluescriptII SK-
88    1 1 0-UU j - j -U-u I Z-to    DNA    pBluescriptII SK-
89    116-011-2-0-F11-CS    DNA    pBluescriptII SK-
90    1 1 6 nil 1 n Fzi    DNA    pBluescriptII SK-
91    1 1 6 n^.1 a. n    DNA    pBluescriptII SK-
92    116-044-2-0-C4-CS    DNA    pBluescriptII SK-
93    1 1 O-U / Z>- 1 -U-xlO-to    DNA    pBluescriptII SK-
94    1 lO-Uy4-4-U-lJD-to    DNA    pBluescriptII SK-
95    1 1 /-UUJ-j-U-rZ-to    DNA
96    1 7 1 -nn7-^-n-D0-p^s






97    145-91-3-0-D10-CS    DNA    pBluescriptII SK-
98    157-17-1-0-F4-CS    DNA    pBluescriptII SK-
99    160-11-3-0-G8-CS    DNA    pBluescriptII SK-
100    160-24-1-0-F12-CS    DNA    pBluescriptII SK-
101    i An O/i o n T70 pc    DNA    pBluescriptII SK-
102    160-25-4-0-D2-CS    DNA    pBluescriptII SK-
103    1 An 11 in A 1 1 PC    DNA
104    1 oU-JZ- 1 -U-ro-to    DNA    pBluescriptII SK-
105    160-37-1-0-A3-CS    DNA    pBluescriptII SK-
106    160-40-3-0-E9-CS    DNA    pBluescriptII SK-
107    160-58-3-0-E4-CS    DNA    pBluescriptII SK-
108    160-85-3-0-D4-CS    DNA    pBluescriptII SK-
109    160-95-3-0-A11-CS    DNA    pBluescriptII SK-


110    162-10-4-0-F9-CS.cor    DNA    pBluescriptII SK-
111    162-10-4-0-F9-CS.fr    DNA    pBluescriptII SK-
112    1 /4-13-2-0-Jb4-to    DNA    pPT
113    174-46-2-0-B11-CS    DNA    pPT
114    1 /y-o-Z-U-Ao-to    DNA    pBluescriptII SK-
115    180-22-3-0-B6-CS    DNA    pBluescriptII SK-
116    181-13-1-0-rv-to    DNA    pBluescriptII SK-
117    181-15-4-0-F7-CS    DNA    pBluescriptII SK-
118    181-20-1-0-G7-CS    DNA    pBluescriptII SK-
119    184-15-3-0-D1-CS    DNA    pBluescriptII SK-


120    187-12-2-0-G11-CS    DNA    pBluescriptII SK-
121    187-2-2-0-A12-CS    DNA    pBluescriptII SK-
122    187-30-0-0-k23-CS    DNA    pBluescriptII SK-
123    187-36-0-0-e19-CS    DNA    pBluescriptII SK-
124    187-38-0-0-d22-CS    DNA    pBluescriptII SK-
125    187-39-0-0-b9-CS    DNA    pBluescriptII SK-
126    187-39-0-0-g6-CS    DNA    pBluescriptII SK-
127    1 87-45-0-0-1 1 8-to    DNA    pBluescriptII SK-
128    187-45-0-0-m21-CS    DNA    pBluescriptII SK-
129    187-45-0-0-n8-CS    DNA    pBluescriptII SK-
130    1 87-46-0-0-123 -to    DNA    pBluescriptII SK-
131    187-5-1-0-A12-CS    DNA    pBluescriptII SK-
132    187-5-1-0-F6-CS    DNA    pBluescriptII SK-


133    187-5-2-0-B2-CS    DNA    pBluescriptII SK-
134    187-5-3-0-D5-CS    DNA    pBluescriptII SK-
135    1 8 /-5 1 -0-0-19-t o    DNA    pBluescriptII SK-
136    18/ -6- 1 -0-r>9-to    DNA    pBluescriptII SK-
137    187-6-4-0-C10-CS    DNA    pBluescriptII SK-
138    188-19-2-0-C8-CS    DNA    pBluescriptII SK-
139    188-22-4-0-G6-CS    DNA    pBluescriptII SK-
140    188-28-4-0-D11-CS    DNA    pBluescriptII SK-


141    188-29-1-0-E10-CS    DNA    pBluescriptII SK-
142    188-34-4-0-E5-CS    DNA    pBluescriptII SK-
143    188-9-3-0-A5-CS    DNA    pBluescriptII SK-
144    105-021-3-0-C3-CS    DNA    pBluescriptII SK-
145    105-037-4-0-H12-CS    DNA    pBluescriptII SK-
146    105-073-2-0-A7-CS    DNA    pBluescriptII SK-
147    109-002-4-0-C6-CS    DNA    pBluescriptII SK-
148    109-003-1-0-G4-CS    DNA    pBluescriptII SK-
149    116-118-4-0-A8-CS    DNA    pBluescriptII SK-
150    145-52-2-0-D12-CS    DNA    pBluescriptII SK-
151    145-7-2-0-G5-CS    DNA    pBluescriptII SK-
152    145-7-3-0-D3-CS    DNA    pBluescriptII SK-
153    157-17-2-0-C1-CS    DNA    pBluescriptII SK-
154    160-101-3-0-H2-CS    DNA    pBluescriptII SK-
155    160-12-1-0-D10-CS    DNA    pBluescriptII SK-
156    1 OU-Zo-4-U-t4-to    DNA    pBluescriptII SK-
157    1 OU-o 1 -j-U-li4-tO    DNA    pBluescriptII SK-
158    160-40-1-0-H4-CS    DNA    pBluescriptII SK-
159    1 OU-J4- 1 -U-r / -to    DNA    pBluescriptII SK-
160    160-88-3-0-A8-CS.cor    DNA    pBluescriptII SK-
161    160-88-3-0-A8-CS.fr    DNA    pBluescriptII SK-
162    160-99-4-0-H4-CS    DNA    pBluescriptII SK-
163    16 l-5-4-0-rs6-to    DNA    pBluescriptII SK-
164    174-17-1-0-D6-CS    DNA    pPT
165    1 /4-32-4-0-ro-to    DNA    pPT


166    174-35-4-0-D11-CS    DNA    pPT
167    1 /4-o-2-0-t 10-to    DNA    pPT
168    179-14-2-0-F11-CS    DNA    pBluescriptII SK-
169    1 /9-9-4-0-r>8-Cb    DNA    pBluescriptII SK-
170    181-10-1-0-C9-CS    DNA    pBluescriptII SK-
171    187-5-3-0-C7-CS    DNA    pBluescriptII SK-
172    1 8 5-26-4-0-r 5 -to    DNA    pBluescriptII SK-
173    188-27-3-0-G1-CS    DNA    pBluescriptII SK-
174    188-29-2-0-H1-CS    DNA    pBluescriptII SK-
175    188-31-1-0-F6-CS    DNA    pBluescriptII SK-
176    188-45-1-0-D3-CS    DNA    pBluescriptII SK-
177    188-5-1-0-H6-CS    DNA    pBluescriptII SK-
178    188-9-1-0-C10-CS    DNA    pBluescriptII SK-
179    105-016-3-0-C5-CS    DNA    pBluescriptII SK-
180    105-026-4-0-D9-CS    DNA    pBluescriptII SK-
181    105-053-2-0-D9-CS    DNA    pBluescriptII SK-


182    105-069-3-0-A11-CS    DNA    pBluescriptII SK-
183    105-076-4-0-F6-CS    DNA    pBluescriptII SK-
184    105-135-2-0-F9-CS    DNA    pBluescriptII SK-
185    106-023-4-0-F6-CS    DNA    pBluescriptII SK-
186    110-001-3-0-C11-CS    DNA    pBluescriptII SK-
187    110-002-3-0-F9-CS    DNA    pBluescriptII SK-
188    1 14-0 19-3-0-U9-to    DNA    pBluescriptII SK-
189    114-029-1-0-C6-CS    DNA    pBluescriptII SK-
190    1 14-032-4-0-15 1-to    DNA    pBluescriptII SK-
191    114-070-2-0-H4-CS    DNA    pBluescriptII SK-


i m 
192 


11A OIA 1 O T7 1 1 A^C 

1 1 6-0 1 6-3-0-r 1 1 -to 


"T*XT A 

JJJNA 


prsiuescnptii oJs.- 


i m 
193 


1 1 A OTT A O A") A*C 

1 16-022-4-0-o2-to 


T^XT A 

JJJNA 


«"D1niaf^«-i<rtfTT OTA 

prsiuescnptii oJv- 


1 O/l 

194 


I 1A O^T T O TJO A'C 

I I o-U jZ-Z-U-rio-to 


"T^XT A 
JJJNA 


nDlnaoomnfTT OTA 

prsiuescnptii ojv- 


1 oc 
195 


1 1 A OC1 A O V>A A*C 

1 1 O-U J J -4-U-Jt>4-tO 


JJJNA 


*>D1iiannii'v\4-TT OTA 

prsiuescnptii oJv- 


1 OA 

196 


1 lo-U94-3-U-rlZ-to 


T^XT A 

JJJN/Y 


t-\T3 1 1 laronnfTT OTA 

prsiuescnptii oiv- 


1 fl'7 

19 / 


11A 1 n /! O A"7 A*C 

116-1 12-4-0-t/-to 


JJJNA 


*>"D1iiiip^tM'«tTT OTA 

prsiuescnptii oJs.- 


1 oo 
198 


1 1 A 1 Tl 1 O T71 T A'C 
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r\M A 
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1 oo 
199 


1 T5 OOO 1 O A^C A^O 

123-008-1 -0-t5-to 


"T\XT A 

JJJNA 


«\T}liiAr>/Mt'*\fTT oy 

prsiuescnptii oJv- 


zuu 


J-Z-U-rio-tO 


TYM A 
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201 


145-57-2-0-C9-CS.cor 


DNA 


pBluescriptll SK- 
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145-57-2-0-C9-CS.fr 


DNA 


pBluescriptll SK- 


203 


145-7-3-0-B12-CS 


DNA 


pBluescriptll SK- 


204 


1 57-12-2 -0-D1-CS 


DNA 


pBluescriptll SK- j 
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r\\T a 

DNA 
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DNA 
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1 £LC\ 1 A 1 1 A D1A 'O 
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UJNA 


pBluescnptll SK- • 


208 


1 £LC\ 1 A/I /I A "C5 /^C" 

1 60- 1 04-4-0-r 3 -CS 


DNA 


pBluescnptll SK- , 


O AA 

209 


1 /xa n o a r\i a /^o 
1 oO-zz-2-O-D 1 0-CS 


DNA 
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pBluescnptll SK- 


1 1 A 

210 


loO-z4o-U-r 12-CS 


DNA 


pBluescnptll SK- 


T1 1 

211 


160-3-2-0-H3-CS 


r\\T a 

DNA 


pBluescnptll SK- 


212 


1 yTA C O ^ A A y— 'O 

1 60-5 8-2-0-A2-CS 


DNA 


pBluescnptll SK- 


213 


1 f f\ 11 1 /\ T> /I y— 'O 

1 60-73 - 1 -0-B4-CS 


DNA 


pBluescnptll SK- 


214 


1 60-75 -4-0-E6-CS 


DNA 


T% 1 J.TT O ~tr 

pBluescnptll SK- 


215


160-97-3-0-E9-CS


DNA


pBluescriptII SK-


216


174-1-4-0-E9-CS


DNA


pPT


217


174-12-4-0-C2-CS


DNA


pPT


218


180-19-4-0-H2-CS


DNA
219


DNA


220


181-3-2-0-F6-CS


DNA


pBluescriptII SK-


221


181-4-4-0-A12-CS


DNA


pBluescriptII SK-


222


181-9-2-0-F12-CS.cor


DNA


pBluescriptII SK-


223


181-9-2-0-F12-CS.fr


DNA


pBluescriptII SK-


224 


iO/i i o "7 a ri 1 y— *c 
pBluescriptII SK-


380


188-22-4-0-G6-CS


PRT


pBluescriptII SK-


381


188-28-4-0-D11-CS


PRT


pBluescriptII SK-


382


188-29-1-0-E10-CS


PRT


pBluescriptII SK-


383


188-34-4-0-E5-CS


PRT


pBluescriptII SK-


384


188-9-3-0-A5-CS


PRT


pBluescriptII SK-


385


105-021-3-0-C3-CS


PRT


pBluescriptII SK-


386


105-037-4-0-H12-CS


PRT


pBluescriptII SK-


387


105-073-2-0-A7-CS


PRT


pBluescriptII SK-


388


109-002-4-0-C6-CS


PRT


pBluescriptII SK-


389


109-003-1-0-G4-CS


PRT


pBluescriptII SK-


390


116-118-4-0-A8-CS


PRT


pBluescriptII SK-


391


145-52-2-0-D12-CS


PRT


pBluescriptII SK-


392


145-7-2-0-G5-CS


PRT


pBluescriptII SK-


393


145-7-3-0-D3-CS


PRT


pBluescriptII SK-


394


157-17-2-0-C1-CS


PRT


pBluescriptII SK-


395


160-101-3-0-H2-CS


PRT


pBluescriptII SK-


396


160-12-1-0-D10-CS


PRT


pBluescriptII SK-


397


160-28-4-0-C4-CS


PRT


pBluescriptII SK-


398


160-31-3-0-b4-CS


PRT


pBluescriptII SK-


399


160-40-1-0-H4-CS


PRT


pBluescriptII SK-


400


160-54-1-0-F7-CS


PRT


pBluescriptII SK-


401


160-88-3-0-A8-CS.cor


PRT


pBluescriptII SK-


402


160-88-3-0-A8-CS.fr


PRT


pBluescriptII SK-


403


160-99-4-0-b4-CS


PRT


pBluescriptII SK-


404


161-5-4-0-B6-CS


PRT


pBluescriptII SK-


405


174-17-1-0-D6-CS


PRT


pPT


406


174-32-4-0-F8-CS


PRT


pPT


407


174-38-4-0-D11-CS


PRT


pPT


408


1 /4-a-z-U-L- 1 U-L-o


PRT


pPT


409


179-14-2-0-F11-CS


PRT


pBluescriptII SK-


410


179_9^-0-B8-CS


PRT


pBluescriptII SK-


411


181-10-1-0-C9-CS


PRT


pBluescriptII SK-


412


187-5-3-0-C7-CS


PRT


pBluescriptII SK-
413


1 OC O/C A A EC f^O 




pBluescriptII SK-


414


1 88-2 /-3-0-O 1 -Li


PRT


pBluescriptII SK-


415


1 88-zy-z-U-ri 1 -to


PRT


pBluescriptII SK-


416


1 88-3 1-1 -U-.bo-L,£>


PRT


pBluescriptII SK-


417


1 88-45- 1-0-JJ3 -to


PRT


pBluescriptII SK-


418


1 88-5-1 -0-rio-CS


PRT


pBluescriptII SK-


419


188-9-1-0-C10-CS


PRT


pBluescriptII SK-


420


105-016-3-0-C5-CS


PRT


pBluescriptII SK-


421


105-026-4-0-D9-CS


PRT


pBluescriptII SK-


422


105-053-2-0-D9-CS


PRT


pBluescriptII SK-


423


105-069-3-0-A11-CS


PRT


pBluescriptII SK-


424


105-076-4-0-F6-CS


PRT


pBluescriptII SK-


425


105-135-2-0-F9-CS


PRT


pBluescriptII SK-


426


106-023-4-0-F6-CS


PRT


pBluescriptII SK-


427


110-001-3-0-C11-CS


PRT


pBluescriptII SK-


428


110-002-3-0-F9-CS


PRT


pBluescriptII SK-


429


114-019-3-0-D9-CS


PRT


pBluescriptII SK-


430


114-029-1-0-C6-CS


PRT


pBluescriptII SK-


431


114-032-4-0-B1-CS


PRT


pBluescriptII SK-


432


114-070-2-0-H4-CS


PRT


pBluescriptII SK-


433


116-016-3-0-F11-CS


PRT


pBluescriptII SK-


434


116-022-4-0-G2-CS


PRT


pBluescriptII SK-


435


116-052-2-0-H8-CS


PRT


pBluescriptII SK-


436


116-053-4-0-B4-CS


PRT


pBluescriptII SK-


437


116-094-3-0-H2-CS


PRT


pBluescriptII SK-


438


116-112-4-0-C7-CS


PRT


pBluescriptII SK-


439


116-123-3-0-F12-CS


PRT


pBluescriptII SK-


440


123-008-1-0-C5-CS


PRT


pBluescriptII SK-


441


145-53-2-0-H8-CS


PRT


pBluescriptII SK-


442


145-57-2-0-C9-CS.cor


PRT


pBluescriptII SK-


443


145-57-2-0-C9-CS.fr


PRT


pBluescriptII SK-


444


145-7-3-0-B12-CS


PRT


pBluescriptII SK-


445


157-12-2-0-D1-CS


PRT


pBluescriptII SK-


446


157-16-2-0-D5-CS


PRT


pBluescriptII SK-


447


157-18-2-0-A7-CS


PRT


pBluescriptII SK-


448


160-103-1-0-B10-CS


PRT


pBluescriptII SK-


449


160-104-4-0-F3-CS


PRT


pBluescriptII SK-


450


160-22-2-0-D10-CS


PRT


pBluescriptII SK-


451


160-24-3-0-F12-CS


PRT


pBluescriptII SK-


452


160-3-2-0-H3-CS


PRT


pBluescriptII SK-


453


160-58-2-0-A2-CS


PRT


pBluescriptII SK-


454


160-73-1-0-B4-CS


PRT


pBluescriptII SK-


455


160-75-4-0-E6-CS


PRT


pBluescriptII SK-


456


160-97-3-0-E9-CS


PRT


pBluescriptII SK-


457


174-1-4-0-E9-CS


PRT


pPT


458


174-12-4-0-C2-CS


PRT


pPT


459


180-19-4-0-H2-CS


PRT


pBluescriptII SK-




460


lo l-lU-4-U-ulZ-Lo


PRT


pBluescriptII SK-


461


181-3-2-0-F6-CS


PRT


pBluescriptII SK-


462


181-4-4-0-A12-CS


PRT


pBluescriptII SK-


463


181-9-2-0-F12-CS.cor


PRT


pBluescriptII SK-


464


181-9-2-0-F12-CS.fr


PRT


pBluescriptII SK-


465





PRT


pBluescriptII SK-


466


1 Si A A 7 ft Fk^ f^Q


PRT


pBluescriptII SK-


467


1 8/1 7 1ft T77 f~*Q


PRT


pBluescriptII SK-


468


i sa s a ft r^Q r^Q


PRT


pBluescriptII SK-


469


ifi7 i ft ^ ft r^o r^Q


PRT


pBluescriptII SK-


470


1 87 10 ft ft r*-»7ft r"^Q


PRT


pBluescriptII SK-


471


l o /-Jz-u-u-nz i -Co. cor


PRT


pBluescriptII SK-


472


l o /-3z-U-U-nzA-Co.ir


PRT


pBluescriptII SK-


473


1 o /-4-z-U-iiO-LxO


PRT


pBluescriptII SK-


474


1 o /-4U-U-U-1 1 j-Lo


PRT


pBluescriptII SK-


475


1 o /-4 /-U-U-gz4-Uo


PRT


pBluescriptII SK-


476


lo /-V-j-U-Az-Lo


PRT


pBluescriptII SK-


477


1 oo-zo-4-U-ri 1 -Lo


PRT


pBluescriptII SK-


478


1 00"JJ*J*v VJ7 V/O


PRT




479


188-38-4-0-D8-CS


PRT


pBluescriptII SK-


480


188-41-1-0-E6-CS


PRT


pBluescriptII SK-


481


188-42-2-0-F3-CS.cor


PRT


pBluescriptII SK-


482


188-42-2-0-F3-CS.fr


PRT


pBluescriptII SK-


Table II



Seq Id No 


Full coding 
sequence 


Signal 
sequence 


Coding sequence
for mature
protein


Polyadenylation 
signal 


Polyadenylation 
site 


i 
i 


ri6Q 1 6Q91 

[ i oy-i oyzj 


T1 6Q 9401 

L i oy-z*+y j 


T9^0 1 6Q91 

[zdu-i oyzj 


T9 1 96 91111 
[Z 1ZO-Z 1 D 1 J 


T9 1 ^9 9901 1 
[Z 1 JZ-ZZO 1 J 


z 


r i as 1 1 Am 

[1^4o-l 1^4UJ 


[ l^o-Z^rUJ 


T9A1 1 1AH1 
[Z^4 1-1 1^+UJ 


T1 ^Q9 1 ^Q71 

[ i Dyz- i Dy / j 


n 61 ^ 1 61 1 1 

[ 1 O 1 D-l OD 1 J 


D 


roc QH61 
[oD-yUOj 


roc nci 
[oD- 1DDJ 


n 16 QH61 


ri i CO 11 641 

[i i Dy- 1 1 o^+j 


n 1 84 1 94^1 
[11 oH- 1Z^4DJ 


A 
H 


rn 1 9 a si 

[D 1 - IZ^foJ 


f3 1 11C1 

[Dl-lDDJ 


r 1 16 1 9A81 
[ 1 DO-lZ^oJ 


None detected


T1 607 1 6911 
[ 1 OU / -I OZDJ 


C 
D 


T79 1 All 


T79 1 1 Ql 

[ /z- 1 1 yj 


T190 1A11 
[ 1ZU- 1*40J 


n 41 6 1 49 1 1 
L i i o- i*+z i j 


T1418 14S41 


6 
O 


nil ii ^.ai 

[111-11 D^4J 


nil i Q7i 
L 1 1 i-iy /j 


T 1 Q8 1 1 ^A1 
[ 1 yo-i 1 D^4J 


T1609 1 6071 

L i ouz-i ou / j 


T1691 1 61Q1 
[ 1 OZD- 1 ODyj 


7 

/ 


r66 i9<\6i 

[DO- 1ZDOJ 


r66 1 711 
[OO-l /DJ 


r 1 74 1 9^61 
[ 1 /^t-lZDOJ 


None detected


ri7C9 17681 
[ 1 / DZ-1 /OoJ 


o 

i " 


r i on i iqsi 
[ iyu-l DyoJ 


n on 9^91 
[ iyu-zDzj 


i T9^1 1 1Q81 
[ZDJ-ljyoJ 


T1470 1A7^\l 
[ l^f /\J-lH /DJ 


r i AQA Kim 
[ LHyH- 1 D 1UJ 


y 


r7fi_A1 m 
[ / 0-^4 1UJ 


T7fi 1 ^1 
[ /o- 1 D D J 


n ^6_ai m 

[ 1 D O-^t 1 UJ 


None detected


T866 8891 
[oOO-ooZJ 


i n 
1 u 




rsA i iai 

[Cv4- 1 D^tJ 


n 1^ 9QQ1 

[ i >?D-zyyj 


r i fi i a 1 fi i qi 
[loi^f-ioiyj 


T1 fill 1 fiAQl 
[ 1 ojj- 1 o^+yj 


1 1 
1 1 


rcc A6fil 
[DD-^fOoJ 


rcc QQI 


n nn A68i 


rcil C161 
[DD 1 -DDOJ 


rcAQ ^6^1 
[D^+y-DODJ 


1 9 
iz 


r 1 CO Al^l 

L 1 DZ-4 /DJ 


T1 ^9 9AA1 
[ 1 DZ-Z^4^4J 


T9A^ A7^1 
[Z^fD-^f /DJ 


T1691 169fil 
[ 1 OZD-1 OZoJ 


T16A7 16611 
[ 1 0^4 /-I OODJ 


i i 

ID 


r i 1 9 ^ ^9i 
L 1 1Z-DDZJ 


n 1 9 i cii 
[1 IZ-loJJ 


T 1 8A ^ ^91 


T706 7111 
[ /UO- / 1 1 J 


T79Q 7AA1 

[ /zy-/4-^fj 


1 A 


rim 1 9Aii 

[ 1U1 - IZ^-DJ 


nni 1 QQi 
[lui-iyyj 


r9nn 1 9Aii 

[ZUU-lZ^f 3} 


T 1 790 1 79^1 
[ 1 /ZU- 1 /ZDJ 


T1 7A^ 1 7^Q1 
[1 /^fD-1 /Dyj 


1 C 

ID 


r i a i c 1 71 
1 1U 1-D 1 /J 


nni 1 QQi 
[ iu i-iyyj 


r9nn ^ 1 7i 

[ZUUO 1 /j 


r 1 7 1 6 1 79 1 1 
[1/lO-l/ZlJ 


T17A1 17^^1 
[1 /^fl-1 /DDJ 


1 6 

lo 


[Dy-oD DJ 


rco i Am 

[oy-iuuj 


n ai e^ii 

L 1U 1-oD d\ 


T8QA 8QQ1 

[oV4- ©yyj 


TQ99 Q161 

[yzz-yjoj 


1 7 
1 / 


r71 A791 
[ / D-O /ZJ 


[ / J-l DZJ 


ri 'X'X 6791 
[ 1 jj-O /ZJ 


T68Q 6QA1 

[Ooy-oy^j 


T71 1 7A71 
[ / 1 1- IH /J 


1 8 
15 


rQA 1 97^1 

[y*fr- iz /dj 


tqa 9 1 m 
[y*f- z i u j 


T9 1 1 1 97^1 
[Z 1 1-1Z /DJ 


T1 8AO 1 fi^Al 

[ i o^y-i oD^fj 


T1 870 1 88A1 
[15 /U-l 554 J 


1 Q 


M9 ^1^1 
[4Z-D 1DJ 


TA9 Q91 

L^z-yzj 


roi ^ 1 ^i 
[y^o idj 


T6AQ 6^A1 

Lo^fy-oD^fj 


T677 6Q 1 1 

[o / /-oy i j 


90 
zu 


T971 Q6Q1 

[z / i-yoyj 


T971 1661 
[Z / 1-DOOJ 


T167 Q6Q1 

[jo /-yoyj 


NOQ1 10Q81 

[ iuyD-iuyoj 


T1 1 9A 1 1 181 
[ 1 IZ^-l 1 Doj 


9 1 

Z 1 


r76 9761 
[ / O-Z / OJ 


T76 1 1^1 
[ /O-l DJ] 


T1 16 9761 
[ 1 jO-Z / OJ 


TA16_AA1 1 


rAcc_4681 
L^DD-^+OoJ 


99 
ZZ 


r6 9871 
[O-Zo /J 


r6 fim 


T81 9871 
[ o 1 -Z o / J 


T68A 68Q1 

[Oo^- ooyj 


T706 7901 
[ /UO-/ZUJ 


1 91 
Z J 


r 1 7 1 6Q91 

[i/i -oyzj 


T171 9971 
[1/1 -ZZ / J 


T998 6Q91 

[zzo-oyzj 


T6Q1 6Q61 

[oy i -oyoj 


\1 1 1 7971 
[ / ID- / Z / J 


9/1 
Z*+ 


ri in azai 
L ID / -*tD*+J 


r 1 n 1 87i 

[1D/-10/J 


T1 88_4^41 


T440_44^1 


r4S6_4701 
L*+DO-^4 / UJ 


9^ 
ZD 


r918 6ooi 
[ZDo-ouyj 


T918 9Q1 1 
[Zjo-zy i j 


T9Q9 60Q1 
[zyz-ouyj 


rQ48-Q^11 


TQ71-Q871 


96 

zo 


rso 8691 

[OU-OOZJ 


T80, 1 971 

[OU-1Z / J 


T1 98 8691 

[ 1 ZO-OOZJ 


T87^ 8801 

[O / J-OOUJ 


T8Q4 Q081 
[oy*+-yuoj 


97 
Z / 


tot o i m 
[oD-D 1UJ 


T81 1 ^71 


n ^8 11 m 
L i jo-j iuj 


T79^ 7101 
[ /ZD- / J\J] 


T748 7691 
[ /^4o- /OZJ 


98 
Zo 


n 1 n oo6i 
[D i u-yuoj 


[D lU-jj / J 


T1^8 Q061 

[DDo-yuoj 


T1071 10761 
[lU/l-lU / o J 


n 088 1 1 091 
L 1 UoO- 1 1 ozj 


9Q 


r9A 9871 
[zh-zo / J 


T9A mi 


n 19 9871 
[ 1 jZ-ZO / J 


r40<\ 41 01 
IZ+UD-H 1UJ 


T499 4161 
[HZZ-H «?OJ 


io 


T1 19-1 ^741 


T1 19-9061 


T907-1 ^741 
[ZU / -1 /"J 


T1 8QQ-1 Q041 
L i oyy- 1 yuHj 


T1 091-1 Q181 


1 1 

D 1 


n 1 7 ^.4^1 

[11/ -DHJJ 


T1 1 7-9AS1 

[11/ -ZHJ J 


T9A6-S4S1 


None detected


T1 1 00-1 1 1 61 
36


[208-1239] 


[208-294] 


[295-1239] 


None detected 


[1307-1323] 


37 


[60-1682] 


[60-143] 


[144-1682] 


None detected 


[1929-1945] 


38 


[198-998] 


[198-269] 


[270-998] 


[1292-1297] 


[1315-1330] 


39 


[505-1590] 


[505-624] 


[625-1590] 


[2089-2094] 


[2108-2124] 


40 


[84-326] 


[84-146] 


[147-326] 


[1122-1127] 


[1142-1159] 


41 


[56-1678] 


[56-139] 


[140-1678] 


None detected 


[1936-1953] 
42


[119-1522] 


[119-181] 


[182-1522] 


None detected 


[1671-1688] 


43 


[334-1551] 


[334-426] 


[427-1551] 


None detected 


[1925-1942] 


44 


[72-986] 


[72-149] 


[150-986] 


[1608-1613] 


[1640-1657] 


45 


[157-1482] 


[157-219] 


[220-1482] 


None detected 


[1716-1733] 


46 


[195-1052] 


[195-338] 


[339-1052] 


None detected 


[1854-1871] 


47 


[217-1410] 


[217-279] 


[280-1410] 


[1482-1487] 


[1507-1536] 


48 


[103-492] 


[103-162] 


[163-492] 


[794-799] 


[815-832]


49 


[234-491] 


[234-293] 


[294-491] 


[793-798] 


[814-831] 


50 


[180-800] 


[180-248] 


[249-800] 


[880-885] 


[901-917] 


51 


[140-472] 


[140-211] 


[212-472] 


None detected 


[605-621] 


52 


[68-484] 


[68-112] 


[113-484] 


None detected 


[657-673] 


53 


[38-517] 


[38-118] 


[119-517] 


[861-866] 


[885-897] 


54 


[92-634] 


[92-139] 


[140-634] 


None detected 


None detected 


55 


[27-767] 


[27-80] 


[81-767] 


None detected 


[1031-1047] 


56 


[4-399] 


[4-126] 


[127-399] 


[891-896] 


[909-923] 


57 


[127-879] 


[127-198] 


[199-879] 


None detected 


[1224-1240] 


58 


[156-566] 


[156-221] 


[222-566] 


[870-875] 


[888-902] 


59 


[35-1657] 


[35-118] 


[119-1657] 


None detected 


[1955-1969] 


60 


[77-937] 


[77-127] 


[128-937] 


[1098-1103] 


[1116-1132] 


61 


[9-503] 


[9-113] 


[114-503] 


[594-599] 


[615-631] 


62 


[21-464] 


[21-95] 


[96-464] 


[650-655] 


[692-722] 


63 


[178-1050] 


[178-279] 


[280-1050] 


[1400-1405] 


[1426-1442] 


64


[32-274] 


[32-178] 


[179-274] 


[756-761] 


[779-795] 


65 


[222-920] 


[222-311] 


[312-920] 


[1191-1196] 


[1220-1236] 


66 


[101-355] 


[101-160] 


[161-355] 


[772-777] 


[788-881] 


67 


[173-487] 


[173-301] 


[302-487] 


[486-491] 


[508-524] 


68 


[210-1082] 


[210-311] 


[312-1082] 


[1432-1437] 


[1456-1472] 


69 


[172-1449] 


[172-255] 


[256-1449] 


None detected 


[1721-1737] 


70 


[30-1427] 


[30-77] 


[78-1427] 


[1594-1599] 


[1621-1637] 


71 


[30-1175] 


[30-77] 


[78-1175] 


[1593-1598] 


[1620-1636] 


72 


[66-839] 


[66-173] 


[174-839] 


None detected 


[1742-1758] 


73 


[64-903] 


[64-162] 


[163-903] 


[1612-1617] 


[1631-1647] 


74 


[64-585] 


[64-162] 


[163-585] 


[1611-1616] 


[1630-1646] 


75 


[274-753] 


[274-324] 


[325-753] 


[1931-1936] 


[1947-1963] 


76 


[191-1468] 


[191-274] 


[275-1468] 


None detected 


[1741-1757] 


77 


[48-950] 


[48-107] 


[108-950] 


[1983-1988] 


[2011-2027]


78 


[156-512] 


[156-206] 


[207-512] 


[1831-1836] 


[1864-1880] 


79 


[67-351] 


[67-183] 


[184-351] 


None detected 


[568-584] 


80 


[259-831] 


[259-375] 


[376-831] 


None detected 


[1337-1351] 


81 


[111-377] 


[111-233] 


[234-377] 


[689-694] 


[706-720] 


82 


[223-432] 


[223-336] 


[337-432] 


[986-991] 


[1015-1029] 


83 


[769-1272] 


[769-843] 


[844-1272] 


None detected 


[1774-1788]


84 


[30-527] 


[30-74] 


[75-527] 


[738-743] 


[756-805]


85 


[39-506] 


[39-83] 


[84-506] 


None detected 


[800-814] 


86 


[115-429] 


[115-210] 


[211-429] 


[565-570] 


[584-598] 


87 


[332-574] 


[332-412] 


[413-574] 


None detected 


[630-699]


88


[133-417] 


[133-213] 


[214-417] 


[876-881] 


[891-905] 


89 


[113-364] 


[113-172] 


[173-364] 


None detected


[500-514] 


90 


[9-380] 


[9-104] 


[105-380] 


[483-488] 


[504-518] 


91


[155-340] 


[155-292] 


[293-340] 


[728-733] 


[754-808] 


92 


[185-634] 


[185-253] 


[254-634] 


[704-709] 


[723-737] 


93 


[53-646] 


[53-91] 


[92-646] 


[694-699] 


[714-728] 


94


[247-510] 


[247-318] 


[319-510] 


[544-549]


[568-582] 


95 


[143-592] 


[143-277] 


[278-592] 


[1877-1882] 


[1898-1913] 


96 


[33-458] 


[33-89] 


[90-458] 


[637-642] 


[654-670] 


97


[1-336] 


[1-81] 


[82-336] 


[900-905] 


[923-939] 


98


[174-443] 


[174-269] 


[270-443] 


[629-634] 


[647-661] 


99 


[282-521] 


[282-386] 


[387-521] 


[600-605] 


[631-647] 


100 


[251-643] 


[251-295] 


[296-643] 


None detected 


[990-1006] 


101 


[179-475] 


[179-295] 


[296-475] 


[995-1000] 


[1015-1059] 


102 


[34-327] 


[34-162] 


[163-327] 


[466-471] 


[498-514] 


103 


[303-953] 


[303-359] 


[360-953] 


[1124-1129] 


[1142-1158] 


104 


[97-645] 


[97-156] 


[157-645] 


[1524-1529] 


[1547-1563] 


105 


[80-820] 


[80-118] 


[119-820] 


[1587-1592] 


[1606-1621] 


106 


[77-388] 


[77-217] 


[218-388] 


[524-529] 


[541-557] 


107 


[139-513] 


[139-201] 


[202-513] 


[566-571] 


[584-600] 


108 


[81-986] 


[81-134] 


[135-986] 


[1092-1097] 


[1113-1129] 


109 


[266-586] 


[266-307] 


[308-586] 


[745-750] 


[762-778] 


110 


[59-745] 


[59-160] 


[161-745] 


None detected 


[1285-1301] 


111 


[59-676] 


[59-160] 


[161-676] 


None detected 


[1284-1300] 


112 


[15-278] 


[15-146] 


[147-278] 


[1580-1585] 


[1600-1617]


113


[167-619]


[167-262] 


[263-619] 


[1598-1603] 


[1617-1634]


114 


[223-417] 


[223-270] 


[271-417] 


[655-660] 


[677-693] 


115 


[166-732] 


[166-237] 


[238-732] 


[753-758] 


[768-784] 


116 


[75-623] 


[75-215] 


[216-623] 


[767-772] 


[788-804] 


117 


[30-335] 


[30-71] 


[72-335] 


[450-455] 


[468-484] 


118 


[21-752] 


[21-107] 


[108-752] 


None detected 


[970-985]


119 


[185-715] 


[185-253] 


[254-715] 


[785-790] 


[814-839] 


120 


[54-527] 


[54-116] 


[117-527] 


[545-550] 


[567-583] 


121 


[129-686] 


[129-185] 


[186-686] 


[989-994] 


[1008-1024] 


122 


[165-614] 


[165-305] 


[306-614] 


[719-724] 


[744-760] 


123 


[192-476] 


[192-326] 


[327-476] 


[555-560] 


[578-594] 


124 


[16-297] 


[16-93] 


[94-297] 


None detected 


[543-559] 


125 


[216-635] 


[216-335] 


[336-635] 


[717-722] 


[728-744] 


126 


[164-280] 


[164-268] 


[269-280] 


[789-794]


[809-824]


127 


[68-301] 


[68-190] 


[191-301] 


[485-490] 


[510-526]


128 


[179-427] 


[179-298] 


[299-427] 


[579-584]


[602-618] 


129 


[22-297] 


[22-66] 


[67-297] 


[742-747] 


[760-776] 


130 


[9-845] 


[9-134] 


[135-845] 


[964-969]


[983-998]


131


[27-578] 


[27-119] 


[120-578] 


[742-747]


[763-779]


132


[408-710] 


[408-533] 


[534-710] 


[985-990] 


[1009-1025] 


133


[247-501] 


[247-306] 


[307-501]


None detected


[592-607]


134


[333-602] 


[333-416] 


[417-602] 


None detected 


[761-774] 


135


[110-376] 


[110-208] 


[209-376] 


[582-587] 


[601-611] 


136 


[22-417] 


[22-66] 


[67-417] 


[888-893] 


[909-925] 


137


[62-367] 


[62-103] 


[104-367] 


[638-643] 


[658-674] 


138


[107-1618] 


[107-178] 


[179-1618] 


[1688-1693] 


[1709-1725] 


139


[16-471] 


[16-93] 


[94-471] 


None detected 


[1458-1474] 


140 


[222-374] 


[222-299] 


[300-374] 


None detected 


[637-653] 


141 


[59-274] 


[59-127] 


[128-274] 


[1452-1457] 


[1474-1490] 


142 


[158-442] 


[158-301] 


[302-442] 


[621-626] 


[645-661] 


143 


[5-454] 


[5-64] 


[65-454] 


[1745-1750] 


[1773-1789] 


144 


[241-1302] 


none detected 


[241-1302] 


[1968-1973] 


[1990-2006] 


145 


[15-635] 


none detected 


[15-635] 


[1057-1062] 


[1080-1096] 


146 


[109-738] 


none detected 


[109-738] 


[1633-1638] 


[1650-1666] 


147 


[21-1145] 


none detected 


[21-1145] 


[1648-1653] 


[1666-1687] 


148 


[70-1596] 


none detected 


[70-1596] 


[1712-1717] 


[1733-1747] 


149 


[129-362] 


none detected 


[129-362] 


[597-602] 


[626-658] 


150


[109-594] 


none detected 


[109-594] 


[1999-2004] 


[2029-2045] 


151 


[150-587] 


none detected 


[150-587] 


None detected 


[772-788] 


152 


[173-847] 


none detected 


[173-847] 


[1894-1899] 


[1915-1931] 


153 


[100-441] 


none detected 


[100-441] 


[479-484] 


[500-514] 


154 


[32-1132] 


none detected 


[32-1132] 


None detected 


[1167-1183]


155 


[160-996] 


none detected 


[160-996] 


[1504-1509] 


[1529-1545] 


156 


[11-529] 


none detected 


[11-529] 


[1042-1047] 


[1053-1068] 


157 


[135-749] 


none detected 


[135-749] 


[1055-1060] 


[1081-1097]


158 


[98-637] 


none detected 


[98-637] 


[862-867] 


[878-894] 


159 


[221-670] 


none detected 


[221-670] 


[669-674] 


[688-703] 


160 


[165-674] 


none detected 


[165-674] 


[808-813] 


[833-849]


161 


[165-671] 


none detected 


[165-671] 


[805-810] 


[830-846] 


162 


[28-1128] 


none detected 


[28-1128] 


[1121-1126] 


[1159-1176] 


163 


[135-194] 


none detected 


[135-194] 


[1050-1055] 


[1068-1084] 


164 


[173-847] 


none detected 


[173-847] 


[1757-1762] 


[1776-1793] 


165 


[8-1141] 


none detected 


[8-1141] 


None detected 


[1832-1849] 


166 


[136-264] 


none detected 


[136-264] 


[1720-1725] 


[1731-1748] 


167 


[14-1048] 


none detected 


[14-1048] 


[1234-1239] 


[1258-1275] 


168 


[70-777] 


none detected 


[70-777] 


[987-992] 


[1007-1023] 


169 


[38-400] 


none detected 


[38-400] 


[1043-1048] 


[1069-1085] 


170 


[63-572] 


none detected 


[63-572] 


[750-755] 


[767-776] 


171 


[160-867] 


none detected 


[160-867] 


[1178-1183] 


[1203-1219] 


172 


[68-640] 


none detected 


[68-640] 


None detected 


[1471-1487] 


173 


[132-1298] 


none detected 


[132-1298] 


[1873-1878] 


[1899-1915] 


174


[259-1701] 


none detected 


[259-1701] 


None detected 


[1974-1990] 


175 


[213-1274] 


none detected 


[213-1274] 


[1940-1945] 


[1955-1971] 


176


[68-127] 


none detected 


[68-127] 


None detected 


[1597-1613] 


177


[65-1024] 


none detected 


[65-1024] 


[1291-1296] 


[1315-1361] 


178 


[109-585] 


none detected 


[109-585] 


[1059-1064] 


[1082-1113] 


179 


[29-577] 


none detected 


[29-577] 


[1917-1922] 


[1944-1960] 
180


[23-451] 


none detected 


[23-451] 


[1405-1410] 


[1427-1443] 


181 


[232-450] 


none detected 


[232-450] 


None detected 


[589-605] 


182 


[758-1183] 


none detected 


[758-1183] 


None detected 


[1708-1724] 


183 


[486-932] 


none detected 


[486-932] 


None detected 


[1670-1686] 


184 


[80-304] 


none detected 


[80-304] 


None detected 


[452-463] 


185 


[188-691] 


none detected 


[188-691] 


[707-712] 


[727-773] 


186 


[94-573] 


none detected 


[94-573] 


None detected 


[739-753] 


187 


[181-462] 


none detected 


[181-462] 


None detected 


[740-754] 


188 


[6-290] 


none detected 


[6-290] 


None detected 


[971-998] 


189 


[115-411] 


none detected 


[115-411]


[573-578] 


[591-605] 


190 


[3-368] 


none detected 


[3-368] 


[481-486] 


[511-526] 


191 


[174-527] 


none detected 


[174-527] 


[878-883] 


[896-910] 


192 


[57-203] 


none detected 


[57-203] 


[579-584] 


[599-668] 


193 


[68-334] 


none detected 


[68-334] 


[562-567] 


[583-637] 


194 


[183-443] 


none detected 


[183-443] 


[670-675] 


[692-706] 


195 


[94-228] 


none detected 


[94-228] 


None detected 


[656-670] 


196 


[133-327] 


none detected 


[133-327] 


[465-470] 


[496-510] 


197


[22-357] 


none detected 


[22-357] 


None detected 


[486-500] 


198 


[4-333] 


none detected 


[4-333] 


[633-638] 


[653-667] 


199 


[1-363] 


none detected 


[1-363] 


[474-479] 


[498-514] 


200


[41-337] 


none detected 


[41-337] 


None detected 


[401-462] 


201 


[1-551] 


none detected 


[1-551] 


None detected 


[535-551] 


202 


[34-315] 


none detected 


[34-315] 


None detected 


[534-550] 


203 


[1-315] 


none detected 


[1-315] 


[371-376] 


[392-408] 


204 


[94-582] 


none detected 


[94-582] 


None detected 


[651-665] 


205


[540-923] 


none detected 


[540-923] 


None detected 


[994-1008] 


206


[77-364] 


none detected 


[77-364] 


[367-372] 


[391-455] 


207 


[65-544] 


none detected 


[65-544] 


[710-715] 


[733-749] 


208 


[117-467] 


none detected 


[117-467] 


[557-562] 


[578-594] 


209


[893-1897] 


none detected 


[893-1897] 


[2066-2071] 


[2082-2098] 


210


[85-342] 


none detected 


[85-342] 


None detected 


[412-428] 


211 


[155-433] 


none detected 


[155-433] 


[713-718] 


[735-769] 


212 


[63-386] 


none detected 


[63-386] 


[878-883] 


[898-914] 


213


[460-1290] 


none detected 


[460-1290] 


[1449-1454] 


[1473-1489] 


214


[21-539] 


none detected 


[21-539] 


[741-746] 


[760-776] 


215


[34-1143]


none detected


[34-1143]


[1375-1380] 


[1397-1412] 


216


[6-1184] 


none detected 


[6-1184] 


[1735-1740] 


[1744-1773] 


217


[29-376] 


none detected 


[29-376] 


None detected 


[1184-1251] 


218


[78-566] 


none detected 


[78-566] 


[858-863] 


[878-894] 


219


[16-705] 


none detected 


[16-705] 


[868-873] 


[894-910] 


220 


[103-405] 


none detected 


[103-405] 


[482-487] 


[503-519] 


221 


[72-350] 


none detected 


[72-350] 


[593-598] 


[616-632] 


222 


[38-436] 


none detected 


[38-436] 


None detected 


[636-652] 


223 


[38-322] 


none detected 


[38-322] 


None detected 


[634-650] 


224 


[202-480] 


none detected 


[202-480] 


[472-477] 


[488-502] 


225 


[171-1670] 


none detected 


[171-1670] 


[1706-1711] 


[1725-1739] 






WO 01/42451 



PCT/IB00/01938 



226 


[199-618] 


none detected 


[199-618] 


[626-631] 


[643-657] 


227 


[182-481] 


none detected 


[182-481] 


None detected 


[874-888] 


228 


[161-517] 


none detected 


[161-517] 


None detected 


[701-716] 


229 


[86-505] 


none detected 


[86-505] 


[618-623] 


[638-654] 


230 


[56-382] 


none detected 


[56-382] 


[598-603] 


[619-635] 


231 


[56-355] 


none detected 


[56-355] 


[597-602] 


[618-634] 


232 


[76-498] 


none detected 


[76-498] 


[546-551] 


[567-583] 


233 


[199-600] 


none detected 


[199-600] 


[705-710] 


[737-753] 


234 


[211-612] 


none detected 


[211-612] 


[717-722] 


[746-762] 


235 


[5-259] 


none detected 


[5-259] 


[502-507] 


[521-537]


236 


[23-370] 


none detected 


[23-370] 


[956-961] 


[978-994] 


237 


[41-352] 


none detected 


[41-352] 


None detected 


[646-662] 


238 


[3-1319] 


none detected 


[3-1319] 


[1791-1796] 


[1813-1829] 


239 


[421-768] 


none detected 


[421-768] 


[1045-1050] 


[1067-1083] 


240 


[78-590] 


none detected 


[78-590] 


None detected 


[1815-1831] 


241 


[78-608] 


none detected 


[78-608] 


None detected 


[1814-1830] 
Table III



List of variants 

92; 119 

14;15 
110;111
69;174;76 

2;12 
172;176;177
150;152;164;166
154;162
77;143

34;62 
230;231 

63;68 

8;47 
48;49;66 

Tj2 
160;161
144;175 

17;21 

31;32 
5;6 

3;10 
96;121 
37;41;59 

70;71 

19;24 
186;195;204 

73;74 
240;241 
221;235 
222;223 

42;45 
157;163
190;229 
117;137 
122;233;234
201;202 
80;139 
Table IV





Preferentially excluded fragments


i 
i 


19? ?lS-?099 2201 


7 


174 ??S-160S 1611 




1111 1 ?4S 




1 S00 1 SOS- 1607 1691 


D 


1 18S 14S1 


o 


1S71 1619 


7 


171? 1768 
i / jz. . i /oo 


8 


1 494 1 S 1 0 


Q 

y 


S70 889 

J /U..OOZ 


i n 


1176 1918-1710 1749-1811 1840 

11 /O.-IZIO,! / 1U..1 /H-Z, 10JJ..1 OH7 


1 1 


910 9S1-4SS S6S 


1 9 
1 Z 


178 990-1616 1661 


1 1 


790 744 


1 A 


700 897-171S 17S0 

/yu.-oz/,! / JJ..1 / »)7 


1 c 
1 J 


788 89S-1711 1 7SS 
/oo..oZj,l / /DO 


1 6 
1 O 


099 016 


1 7 
1 / 


668 747 


1 8 
1 o 


1 870 1 884 


1 Q 

i y 


677 601 

O / / ..07 1 


9H 
ZU 


1 1 94 1118 
1 lZ*t.. 1 1 JO 


9 1 
Z 1 


4S0 468 


ZZ 


101 41 1 -706 790 
J7J ..HI 1,/UO.. / ZU 


91 
Z J 


711 797 


94 
Z^t 


4S6 470 


9S 


876 098-071 087 

O / 0.."ZOj" / J..7O / 


96 
ZO 


804 008 


97 

z / 


748 76? 
/to. . / 


98 

ZO 


1 088 1 1 0? 

1 WOO. . 1 1 \JZ. 


9Q 
zy 


All 416 


10 


1870 1018-10?! 1018 

IO / y..l7lO,l Z7^3 .. lyJO 


1 1 


774 1116 

/ /t..! 1 1U 


19 
jz 


77? 1114 


11 


90S6 ?07? 


14 


191 409 




784 816 


36 


544..551;1307..1323


37 


1867..1874;1929..1945 


38 


1315..1330 


39 


2108..2124


40 


413..421;1116..1159


41 


1863..1870;1936..1953 


42 


1623..1688


43


1895..1942


44 


1640..1657


45


1661..1733


46 


1555..1871 


47 


1507..1523


48 


541..832


49 


540..831 


50 


901..917


51 


2..10;605..621 


52 


585..673


53 


885..897 


54 


4..13;761..1101 


55 


1031..1047 


56 


873..905;907..923 


57 


1224..1240


58 


861..902


59 


1842..1849;1955..1969


60 


1116..1132


61 


15..46;615..631 


62 


651..722


63 


1426..1442


64 


739..795 


65 


1220..1236


66 


520..881 


67 


413..524 


68 


1444..1472


69 


1721..1737 


70 


1621..1637


71 


1620..1636


72 


777..784;1742..1758 


73 


1631..1647 


74 


1630..1646 


75 


1947..1963


76 


1741..1757 


77 


1561..1913;2011..2027 


78 


727..819;880..894;901..1280;1841..1880 


79 


418..584 


80 


331..353;844..1214;1337..1351 


81 


706..720 


82 


639..713;1008..1029 


83 


1454..1788


84 


712..805 


85 


800..814 


86 


584..598 


87 


122..308;593..699 


88 


855..905 
89


500..514 


90 


81..101;198..205;504..518 


91 


650..808


92 


128..201;723..737 


93 


714..728 


94 


568..582 


95 


1761..1773;1898..1913


96 


654..670


97 


883..938 


98 


616..661 


99 


631..647


100 


853..1006


101 


537..544;949..1059 


102 


498..514 


103 


1142..1158 


104 


1524..1563


105 


1230..1259;1606..1621 


106 


505..557 


107 


584..600


108 


378..385;1113..1129 


109 


729..778 


110 


992..1301


111 


991..1300


112 


1131..1139;1569..1617


113 


1526..1634


114 


457..509;677..693 


115 


768..784 


116 


360..670;788..804 


117 


435..484 


118 


433..452;764..985 


119 


128..201;801..839 


120 


554..564;567..583 


121 


872..908;1008..1024


122 


744..760 


123 


578..594 


124 


94..102;248..559


125 


728..744 


126 


809..824 


127 


510..526 


128 


602..618 


129 


472..553;569..776 


130 


983..998


131 


396..468;763..779 


132 


478..532;1009..1025 


133 


592..607 


134 


761..774
135


556..563;601..611 


136 


887..919 


137 


658..674 


138 


1651..1725 


139 


49..71;988..1358;1458..1474 


140 


324..653 


141 


720..730;1449..1490 


142 


44..119;498..505;578..585;645..661 


143 


1322..1666;1773..1789 


144 


1828..1897;1919..1968;1990..2006 


145 


936..955;1060..1096 


146 


778..827;1650..1666 


147 


1170..1207;1647..1687 


148 


1733..1747 


149 


579..658 


150 


1432..1440;1728..1778;2004..2045 


151 


772..788


152 


1496..1504;1792..1842;1915..1931


153 


500..514 


154 


1167..1183


155 


1529..1545


156 


703..1068


157 


873..881;1081..1097


158 


878..894 


159 


688..703 


160 


833..849 


161 


830..846 


162 


1159..1176


163 


869..876;1068..1084 


164 


1444..1463;1496..1504;1743..1793


165 


1233..1319;1697..1849 


166 


1407..1426;1459..1467;1694..1748


167 


1258..1275 


168 


84..129;1002..1023 


169 


436..472;596..604;673..689;732..954;995..1085 


170 


767..776 


171 


1203..1219 


172 


1411..1487


173 


1861..1915 


174 


1974..1990


175 


1800..1869;1891..1940;1955..1971 


176 


1597..1613 


177 


186..212;1277..1361 


178 


930..978;1002..1113 


179 


951..1000;1364..1533;1944..1960


180 


1427..1443
181


107..181;276..311;449..605 


182 


1143..1450;1677..1724


183 


1..251;648..655;1347..1686 


184 


447..463


185 


150..159;623..773 


186 


340..476;739..753 


187 


740..754 


188 


307..315;668..998 


189


118..125;529..536;591..605 


190 


492..526 


191 


872..910


192 


525..668


193 


91..135;461..637 


194 


392..458;551..671;692..706 


195 


656..670 


196 


283..379;458..466;496..510 


197 


1..96;483..500 


198 


625..667


199 


474..513 


200 


370..462


201 


535..551 


202 


534..550 


203 


374..408


204 


651..665


205 


994..1008


206 


348..455 


207 


733..749


208 


1..49;578..594 


209 


2082..2098 


210 


412..428 


211 


689..769 


212 


898..914 


213 


1266..1489


214 


760..776 


215 


1304..1311;1383..1412 


216 


648..691;1711..1773 


217 


644..856;910..1251 


218 


878..894 


219 


894..910 


220 


503..519 


221 


616..632 


222 


636..652 


223 


634..650 


224 


50..57;488..502 


225 


534..577;1725..1739 


226 


1 643. .657 
227


1..84;874..888 


228 


701..716 


229 


638..654 


230 


263..573;619..635 


231 


263..573;619..635 


232 


567..583 


233 


737..753 


234 


746..762


235 


499..537 


236 


905..912;944..994 


237 


348..662


239 


829..1083


240 


1508..1831


241 


1507..1830






WO 01/42451 



PCT/IB00/01938 



Table Va 



Seq Id No


Preferentially excluded fragments


Preferentially included fragments


1 


[1-540];[556-615];[2061-2096];[2098-2201]


[541-555];[616-2060];[2097-2097]


2 


[1-511];[533-619];[621-690];[730-1132]


[512-532];[620-620];[691-729];[1133-1631]


3 


[2-539];[1178-1245]


[1-1];[540-1177]


4 


[1-250];[297-383];[386-514];[1025-1064]


[251-296];[384-385];[515-1024];[1065-1623] 


5 


[27-116];[118-391]


[1-26];[117-117];[392-1454]


6 


[1-93];[96-168];[170-262];[264-461]


[94-95];[169-169];[263-263];[462-1639] 


7 


[1-95];[97-451]


[96-96];[452-1768] 


8 


[1-502];[1314-1491]


[503-1313];[1492-1510]


9 


[1-864] 


[865-882] 


10 


[1-428] 


[429-1849] 


11 


[1-454];[482-514]


[455-481];[515-565]


12 


[1-375];[379-511];[533-690];[730-783];[814-1164]


[376-378];[512-532];[691-729];[784-813];[1165-1663]


13 


[2-337];[339-556] 


[1-1];[338-338];[557-744]


14 


[29-366];[368-507] 


[1-28];[367-367];[508-1759]


15 


[29-366];[368-524] 


[1-28];[367-367];[525-1755]


16 


[1-641] 


[642-936] 


17 


[1-708];[711-747]


[709-710] 


18 


[1-639] 


[640-1884] 


19 


[1-631] 


[632-691] 


20 


[3-416];[418-490]


[1-2];[417-417];[491-1138]


21 


[1-468] 


None 


22 


[1-720] 


None 


23 


[1-711] 


[712-727] 


24 


[1-469] 


[470-470] 


25 


[1-231];[234-488]


[232-233];[489-987] 


26 


[1-296];[300-642];[644-737]


[297-299];[643-643];[738-908] 


27 


[1-306];[308-762]


[307-307] 


28 


[1-446];[448-1102]


[447-447] 


29 


[1-436] 


None 




30


[7-334];[1420-1468];[1474-1614];[1616-1804];[1845-1919]


[1-6];[335-1419];[1469-1473];[1615-1615];[1805-1844];[1920-1938]


31 


[1-342];[345-519];[823-893];[977-1016]


[343-344];[520-822];[894-976];[1017-1116]


32 


[1-517];[821-891];[975-1014] 


[518-820];[892-974];[1015-1114]


33


[36-352];[354-457];[728-832];[834-1096];[1253-1289];[1291-1350];[1352-1412];[1726-1873]


[1-35];[353-353];[458-727];[833-833];[1097-1252];[1290-1290];[1351-1351];[1413-1725];[1874-2072]


34 


[1-409] 


None 


35 


[14-105] 


[1-13];[106-836]


36 


[1-572];[1120-1271]


[573-1119];[1272-1323]


37 


[20-98];[100-510];[1591-1681];[1683-1870] 


[1-19];[99-99];[511-1590];[1682-1682];[1871- 
1945] 
38


[1-547] 


[548-1330] 


39 


[1-445] 


[446-2124] 


40 


[1-473];[475-528]


[474-474];[529-1159] 


41 


[16-506];[1587-1866] 


[1-15];[507-1586];[1867-1953] 


42 


[2-234];[244-451];[974-1226]


[1-1];[235-243];[452-973];[1227-1688]


43 


[1-455];[1670-1925] 


[456-1669];[1926-1942]


44 


[1-579];[815-1031] 


[580-814];[1032-1657] 


45 


[1-489];[1012-1264] 


[490-1011];[1265-1733]


46 


[1-400];[1184-1223];[1225-1705];[1740-1818] 


[401-1183];[1224-1224];[1706-1739];[1819- 
1871] 


47 


[1-529];[1326-1505] 


[530-1325];[1506-1523] 


48 


[1-131];[133-510];[560-589]


[132-132];[511-559];[590-832] 


49 


[1-130];[132-509];[559-588]


[131-131];[510-558];[589-831] 


50 


[1-650];[652-868];[873-913]


[651-651];[869-872];[914-917] 


51 


[1-504];[515-605]


[505-514];[606-621]


52 


[1-535] 


[536-673] 


53 


[2-563] 


[1-1];[564-897]


54 


[1-527];[802-870];[882-934];[966-1018];[1037-1080]


[528-801];[871-881];[935-965];[1019-1036];[1081-1101]


55 


[1-326];[328-505]


[327-327];[506-1047]


56 


[1-340] 


[341-925] 


57 


[1-528] 


[529-1240] 


58 


[1-108];[115-151];[154-340];[342-529]


[109-114];[152-153];[341-341];[530-902]


59 


[4-485];[1566-1656];[1658-1845]


[1-3];[486-1565];[1657-1657];[1846-1969]


60 


[1-283] 


[284-1132] 


61 


[9-468] 


[1-8];[469-631]


62 


[1-525];[689-722]


[526-688] 


63 


[1-88];[90-192];[194-265];[296-409]


[89-89];[193-193];[266-295];[410-1442] 


64 


[1-517] 


[518-795] 


65 


[1-406];[408-739]


[407-407];[740-1236] 


66 


[1-489];[849-881]


[490-848] 


67 


[1-505] 


[506-524] 


68 


[1-325];[328-441];[444-504]


[326-327];[442-443];[505-1472] 


69 


[1-524];[636-715];[717-809];[811-885];[1567-1715]


[525-635];[716-716];[810-810];[886-1566];[1716-1737]


70 


[12-487] 


[1-11];[488-1637]


71 


[12-487] 


[1-11];[488-1636]


72 


[1-451] 


[452-1758] 


73 


[1-167];[242-464]


[168-241];[465-1647] 


74 


[1-167];[242-464]


[168-241];[465-1646] 


75 


[1-471] 


[472-1963] 


76 


[1-358];[360-543];[655-734];[736-828];[830-
904];[1586-1734]


[359-359];[544-654];[735-735];[829-829];[905- 
1585];[1735-1757] 


77 


[3-34];[36-474];[582-770];[1709-1746];[1748-1785];[1825-1899]


[1-2];[35-35];[475-581];[771-1708];[1747-1747];[1786-1824];[1900-2027]


78 


[1-75];[77-319];[914-1052];[1063-1126];[1168-1203]


[76-76];[320-913];[1053-1062];[1127-1167];[1204-1880]
79


[1-425] 


[426-584] 


80 


[1-752];[947-1017];[1084-1170]


[753-946];[1018-1083];[1171-1351]


81 


[1-496];[498-720]


[497-497] 


82 


[1-324] 


[325-1029] 


83 


[1-477];[1474-1529];[1537-1566];[1577-
1616];[1622-1662];[1717-1753]


[478-1473];[1530-1536];[1567-1576];[1617- 
1621];[1663-1716];[1754-1788] 


84 


[1-496];[499-568];[752-805]


[497-498];[569-751] 


85 


[1-527] 


[528-814] 


86 


[1-360] 


[361-598] 


87 


[1-78];[80-583];[625-699]


[79-79];[584-624] 


88 


[1-889] 


[890-905] 


89 


[1-513] 


[514-514]


90 


[1-122];[124-155];[157-435];[437-517]


[123-123];[156-156];[436-436];[518-518]


91 


[1-133];[165-808]


[134-164] 


92 


[1-725] 


[726-737] 


93


[1-409] 


[410-728] 


94 


[1-331] 


[332-582] 


95 


[1-410] 


[411-1913] 


96 


[1-501] 


[502-670] 


97 


[1-141];[143-431] 


[142-142];[432-939] 


98 


[1-193] 


[194-661] 


99 


[1-629] 


[630-647] 


100 


[1-520];[862-954];[976-1005]


[521-861];[955-975];[1006-1006]


101 


[1-489];[581-961];[1010-1059] 


[490-580];[962-1009]


102 


[1-485] 


[486-514] 


103 


[1-540] 


[541-1158] 


104 


[1-556] 


[557-1563] 


105 


[1-868];[870-1006]


[869-869];[1007-1621] 


106 


[1-491] 


[492-557] 


107 


[1-573] 


[574-600] 


108 


[1-457];[586-1110]


[458-585];[1111-1129]


109 


[1-521];[655-778]


[522-654] 


110 


[1-416];[478-614];[616-990];[992-1065];[1068-1283]


[417-477];[615-615];[991-991];[1066-1067];[1284-1301]


111 


[1-416];[478-614];[628-989];[991-1064];[1067-1282]


[417-477];[615-627];[990-990];[1065-1066];[1283-1300]


112 


[2-429];[1161-1202];[1212-1388];[1392-1589] 


[1-1];[430-1160];[1203-1211];[1389- 
1391];[1590-1617] 


113 


[1-487] 


[488-1634] 


114 


[1-70];[86-496]


[71-85];[497-693] 


115 


[1-358];[360-558]


[359-359];[559-784] 


116 


[1-215];[218-495];[527-607]


[216-217];[496-526];[608-804]


117 


[1-466] 


[467-484] 


118 


[1-515];[906-963]


[516-905];[964-985] 


119 


[1-744];[746-816]


[745-745];[817-839] 


120 


[1-85];[87-521]


[86-86];[522-583] 
121


[1-532] 


[533-1024] 


122 


[1-318];[325-517];[567-660]


[319-324];[518-566];[661-760]


123 


[1-498] 


[499-594] 


124 


[1-427] 


[428-559] 


125 


[1-642] 


[643-744] 


126 


[1-341];[350-696]


[342-349];[697-824]


127 


[1-482] 


[483-526] 


128 


[1-338] 


[339-618] 


129 


[1-191];[193-429];[450-678]


[192-192];[430-449];[679-776]


130 


[19-463];[465-544] 


[1-18];[464-464];[545-998]


131 


[1-470] 


[471-779] 


132 


[1-533] 


[534-1025] 


133 


[1-498] 


[499-607] 


134 


[1-168];[170-326];[328-471];[552-738]


[169-169];[327-327];[472-551];[739-774]


135 


[1-346];[348-395];[440-473]


[347-347];[396-439];[474-611]


136 


[1-324];[343-436]


[325-342];[437-925]


137 


[1-186];[188-251];[255-517] 


[187-187];[252-254];[518-674]


138 


[1-488] 


[489-1725] 


139 


[1-101];[103-190];[292-327];[1091- 
1161];[1228-1314] 


[102-102];[191-291];[328-1090];[1162- 
1227];[1315-1474] 


140


[1-465];[516-653]


[466-515]


141


[1-761];[763-857];[912-1326]


[762-762];[858-911];[1327-1490]


142 


[1-476] 


[477-661]


143 


[1-531];[1471-1508];[1510-1547];[1587-1661]


[532-1470];[1509-1509];[1548-1586];[1662-1789]


144 


[1-492];[503-536]


[493-502];[537-2006] 


145 


[1-570] 


[571-1096] 


146 


[1-536];[621-703];[729-1075];[1198-1445]


[537-620];[704-728];[1076-1197];[1446-1666]


147 


[1-555];[578-628]


[556-577];[629-1687] 


148 


[1-444];[1201-1474];[1480-1516] 


[445-1200];[1475-1479];[1517-1747]


149 


[1-613];[626-658]


[614-625] 


150 


[4-199];[201-419];[421-492]


[1-3];[200-200];[420-420];[493-2045]


151 


[1-509] 


[510-788] 


152 


[1-483];[485-578]


[484-484];[579-1931] 


153 


[1-497] 


[498-514] 


154 


[5-509];[579-763];[765-1162]


[1-4];[510-578];[764-764];[1163-1183]


155 


[1-486];[1095-1500] 


[487-1094];[1501-1545] 


156 


[1-488];[740-797];[799-884];[895-974]


[489-739];[798-798];[885-894];[975-1068] 


157 


[1-161];[163-565];[567-701]


[162-162];[566-566];[702-1097] 


158 


[1-496];[692-754]


[497-691];[755-894] 


159 


[1-483] 


[484-703] 


160 


[1-494] 


[495-849] 


161 


[1-491] 


[492-846] 


162 


[1-505];[575-759];[761-1164]


[506-574];[760-760];[1165-1176]


163 


[1-699] 


[700-1084] 


164 


[38-483];[485-556] 


[1-37];[484-484];[557-1793]
165


[1-426];[1303-1444];[1717-1755];[1787-1825] 


[427-1302];[1445-1716];[1756-1786];[1826- 
1849] 


166 


[2-264];[266-446];[448-519]


[1-1];[265-265];[447-447];[520-1748]


167 


[1-519];[523-552]


[520-522];[553-1275]


168 


[1-457];[466-571]


[458-465];[572-1023]


169 


[1-54];[57-501]


[55-56];[502-1085]


170 


[1-541] 


[542-776]


171


[1-489] 


[490-1219] 


172 


[1-538];[977-1468]


[539-976];[1469-1487]


173 


[1-631] 


[632-1915] 


174


[21-776];[888-967];[969-1061];[1063-1137];[1819-1967]


[1-20];[777-887];[968-968];[1062-1062];[1138-1818];[1968-1990]


175 


[1-508] 


[509-1971]


176 


[1-127];[129-538];[979-1470]


[128-128];[539-978];[1471-1613]


177 


[1-535];[973-1173];[1177-1330];[1332-1361]


[536-972];[1174-1176];[1331-1331]


178 


[1-599];[626-830];[1082-1113]


[600-625];[831-1081]


179 


[1-623];[1377-1406] 


[624-1376];[1407-1960]


180 


[1-414];[418-464]


[415-417];[465-1443]


181


[1-522];[533-587]


[523-532];[588-605] 


182 


[1-78];[99-131];[136-327];[1153-1184];[1210-1274];[1284-1319];[1385-1416]


[79-98];[132-135];[328-1152];[1185-1209];[1275-1283];[1320-1384];[1417-1724]


183 


[1-512];[617-805];[871-952];[1387-1422];[1621-1661]


[513-616];[806-870];[953-1386];[1423-1620];[1662-1686]


184 


[1-453] 


[454-463]


185 


[1-773]


None 


186 


[1-413];[423-604];[606-739]


[414-422];[605-605];[740-753]


187 


[1-117];[119-401]


[118-118];[402-754]


188 


[1-511];[684-870];[872-928];[935-981]


[512-683];[871-871];[929-934];[982-998]


189 


[1-605] 


None 


190 


[2-475] 


[1-1];[476-526]


191 


[1-910] 


None 


192 


[1-101];[103-668]


[102-102]


193 


[1-520];[583-637]


[521-582]


194 


[1-706] 


None 


195


[1-145];[150-451];[466-670]


[146-149];[452-465]


196 


[1-509] 


[510-510]


197 


[1-500]


None 


198 


[1-503];[505-585]


[504-504];[586-667]


199


[1-498]


[499-514]


200


[1-462]


None


201 


[1-551]


None 


202 


[1-482];[484-550]


[483-483]


203


[1-408]


None 


204 


[1-519];[521-649]


[520-520];[650-665] 


205 


[1-261];[263-415];[417-640];[642-782]


[262-262];[416-416];[641-641];[783-1008]


206 


[1-455] 


None 
207


[1-402];[410-526]


[403-409];[527-749]


208


[1-520]


[521-594]


209


[1-197];[200-472]


[198-199];[473-2098]


210


[1-311];[314-427]


[312-313];[428-428]


211 


[1-689];[735-769]


[690-734] 


212 


[1-517] 


[518-914] 


213 


[2-576];[756-795];[1390-1441]


[1-1];[577-755];[796-1389];[1442-1489]


214 


[1-482] 


[483-776] 


215 


[1-498] 


[499-1412] 


216 


[1-505];[1000-1293];[1295-1408];[1744-1773]


[506-999];[1294-1294];[1409-1743]


217 


[1-102];[104-291];[293-467];[486-708];[723-
sni-rsi^ onm-roin in^n-no,^ 

1090];[1097-1153] 


[103-103];[292-292];[468-485];[709-722];[832-
s^9i-roni Qnoi-nn^9 ift^i-nnoi inoAi-niy 

1251] 


218


[1-452]


[453-894]


219 


[1-554];[556-598]


[555-555];[599-910]


220 


[1-38];[41-95];[98-386];[388-487]


[39-40];[96-97];[387-387];[488-519]


221 


[1-34];[38-220];[222-335];[337-518]


[35-37];[221-221];[336-336];[519-632]


222 


[1-468]


[469-652]


223 


[1-466] 


[467-650]


224 


[1-466]


[467-502]


225 


[1-489];[653-1008]


[490-652];[1009-1739]


226 


[1-657]


None 


227 


[1-480]


[481-888]


228 


[1-501] 


[502-716] 


229 


[1-612] 


[613-654] 


230 


[1-477];[485-538]


[478-484];[539-635] 


231 


[1-476];[484-537]


[477-483];[538-634]


232 


[1-367];[371-512] 


[368-370];[513-583]


233 


[1-305];[307-442];[460-503];[553-646]


[306-306];[443-459];[504-552];[647-753]


234 


[1-260];[262-345];[347-454];[473-515];[565-658]


[261-261];[346-346];[455-472];[516-564];[659-762]


235


[1-497]


[498-537]


236 


[1-465] 


[466-994] 


237 


[1-471];[496-526];[557-587];[597-637]


[472-495];[527-556];[588-596];[638-662] 


238 


[1-338];[352-497]


[339-351];[498-1829]


239 


[1-501] 


[502-1083] 


240 


[1-515];[1527-1583];[1585-1687];[1692-1831] 


[516-1526];[1584-1584];[1688-1691] 


241 


[1-515];[1526-1582];[1584-1686];[1691-1830] 


[516-1525];[1583-1583];[1687-1690] 
Table Vb



Seq Id No 


Preferentially excluded fragments 


Preferentially included fragments 


1 


[1-540];[556-615];[2061-2096];[2098-2201]


[541-555];[616-2060];[2097-2097]


2 


[1-511];[533-619];[621-690];[730-1132]


[512-532];[620-620];[691-729];[1133-1631]


3 


[2-539];[1178-1245]


[1-1];[540-1177]


4 


[1-250];[297-383];[386-514];[1025-1064]


[251-296];[384-385];[515-1024];[1065-1623]


5 


[27-116];[118-391]


[1-26];[117-117];[392-1454]


6 


[1-93];[96-168];[170-262];[264-461]


[94-95];[169-169];[263-263];[462-1639]


7 


[1-95];[97-451]


[96-96];[452-1768] 


8 


[1-502];[1314-1491] 


[503-1313];[1492-1510] 


9 


[1-864] 


[865-882] 


10 


[1-428] 


[429-1849] 


11 


[1-454];[482-514]


[455-481];[515-565] 


12 


[1-375];[379-511];[533-690];[730-783];[814-1164]


[376-378];[512-532];[691-729];[784-813];[1165-1663]


13 


[2-337];[339-556] 


[1-1];[338-338];[557-744]


14 


[29-366];[368-507] 


[1-28];[367-367];[508-1759]


15 


[29-366];[368-524] 


[1-28];[367-367];[525-1755]


16 


[1-641] 


[642-936] 


17 


[1-708];[711-747]


[709-710] 


18 


[1-639] 


[640-1884] 


19 


[1-631] 


[632-691] 


20 


[3-416];[418-490]


[1-2];[417-417];[491-1138] 


21 


[1-468] 


None 


22 


[1-720] 


None 


23 


[1-711] 


[712-727] 


24 


[1-469] 


[470-470] 


25 


[1-231]; [234-488] 


[232-233];[489-987] 


26 


[1-296];[300-642];[644-737]


[297-299];[643-643];[738-908]


27 


[1-306];[308-762]


[307-307] 


28 


[1-446];[448-1102]


[447-447] 


29 


[1-436] 


None 


30 


[7-334];[1420-1468];[1474-1614];[1616- 
1804];[1845-1919] 


[1-6];[335-1419];[1469-1473];[1615-1615];[1805-
1844];[1920-1938]


31 


[1-342];[345-519];[823-893];[977-1016]


[343-344];[520-822];[894-976];[1017-1116]


32 


[1-517];[821-891];[975-1014] 


[518-820];[892-974];[1015-1114] 


33 


[36-352];[354-457];[728-832];[834-1096];[1253-1289];[1291-1350];[1352-1412];[1726-1873]


[1-35];[353-353];[458-727];[833-833];[1097-1252];[1290-1290];[1351-1351];[1413-1725];[1874-2072]


34 


[1-409] 


None 


35 


[14-105] 


[1-13];[106-836]


36 


[1-572];[1120-1271]


[573-1119];[1272-1323]


37 


[20-98];[100-510];[1591-1681];[1683-1870] 


[1-19];[99-99];[511-1590];[1682-1682];[1871- 
1945] 


38 


[1-547] 


[548-1330] 
39


[1-445] 


[446-2124] 


40 


[1-473];[475-528]


[474-474];[529-1159]


41 


[16-506];[1587-1866] 


[1-15];[507-1586];[1867-1953] 


42 


[2-234];[244-451];[974-1226]


[1-1];[235-243];[452-973];[1227-1688]


43 


[1-455];[1670-1925]


[456-1669];[1926-1942]


44


[1-579];[815-1031]


[580-814];[1032-1657]


45


[1-489];[1012-1264]


[490-1011];[1265-1733]


46


[1-400];[1184-1223];[1225-1705];[1740-1818]


[401-1183];[1224-1224];[1706-1739];[1819-1871]


47 


[1-529];[1326-1505] 


[530-1325];[1506-1523] 


48 


[1-131];[133-510];[560-589]


[132-132];[511-559];[590-832] 


49 


[1-130];[132-509];[559-588]


[131-131];[510-558];[589-831] 


50 


[1-650];[652-868];[873-913]


[651-651];[869-872];[914-917]


51 


[1-504];[515-605]


[505-514];[606-621]


52 


[1-535] 


[536-673] 


53 


[2-563] 


[1-1];[564-897]


54 


[1-527];[802-870];[882-934];[966-1018];[1037-1080]


[528-801];[871-881];[935-965];[1019-1036];[1081-1101]


55 


[1-326];[328-505]


[327-327];[506-1047] 


56 


[1-340] 


[341-925] 


57 


[1-528] 


[529-1240] 


58 


[1-108];[115-151];[154-340];[342-529]


[109-114];[152-153];[341-341];[530-902]


59 


[4-485];[1566-1656];[1658-1845] 


[1-3];[486-1565];[1657-1657];[1846-1969]


60 


[1-283] 


[284-1132] 


61 


[9-468] 


[1-8];[469-631]


62 


[1-525];[689-722]


[526-688] 


63 


[1-88];[90-192];[194-265];[296-409]


[89-89];[193-193];[266-295];[410-1442] 


64 


[1-517] 


[518-795] 


65


[1-406]; [408-739] 


[407-407];[740-1236] 


66 


[1-489]; [849-881] 


[490-848] 


67 


[1-505] 


[506-524] 


68 


[1-325];[328-441];[444-504]


[326-327];[442-443];[505-1472] 


69 


[1-524];[636-715];[717-809];[811-
885];[1567-1715]


[525-635];[716-716];[810-810];[886-1566];[1716- 
1737] 


70 


[12-487] 


[1-11];[488-1637] 


71 


[12-487] 


[1-11];[488-1636] 


72 


[1-451] 


[452-1758] 


73 


[1-167];[242-464]


[168-241];[465-1647]


74 


[1-167];[242-464]


[168-241];[465-1646]


75 


[1-471] 


[472-1963]


76 


[1-358];[360-543];[655-734];[736-828];[830-
904];[1586-1734]


[359-359];[544-654];[735-735];[829-829];[905- 
1585];[1735-1757] 


77 


[3-34];[36-474];[582-770];[1709-1746];[1748-1785];[1825-1899]


[1-2];[35-35];[475-581];[771-1708];[1747-1747];[1786-1824];[1900-2027]


78 


[1-75];[77-319];[914-1052];[1063-1126];[1168-1203]


[76-76];[320-913];[1053-1062];[1127-1167];[1204-1880]


79 


[1-425] 


[426-584] 
80


[1-752];[947-1017];[1084-1170]


[753-946];[1018-1083];[1171-1351]


81 


[1-496]; [498-720] 


[497-497] 


82 


[1-324] 


[325-1029] 


83 


[1-477];[1474-1529];[1537-1566];[1577- 
1616];[1622-1662];[1717-1753] 


[478-1473];[1530-1536];[1567-1576];[1617-
1621];[1663-1716];[1754-1788]


84 


[1-496];[499-568];[752-805]


[497-498];[569-751] 


85 


[1-527] 


[528-814] 


86 


[1-360] 


[361-598] 


87 


[1-78];[80-583];[625-699]


[79-79];[584-624]


88 


[1-889] 


[890-905] 


89 


[1-513] 


[514-514] 


90 


[1-122];[124-155];[157-435];[437-517]


[123-123];[156-156];[436-436];[518-518] 


91 


[1-133];[165-808]


[134-164] 


92 


[1-725] 


[726-737] 


93 


[1-409] 


[410-728] 


94 


[1-331] 


[332-582] 


95 


[1-410] 


[411-1913] 


96 


[1-501] 


[502-670] 


97 


[1-141];[143-431] 


[142-142];[432-939] 


98 


[1-193] 


[194-661] 


99 


[1-629] 


[630-647] 


100 


[1-520];[862-954];[976-1005]


[521-861];[955-975];[1006-1006]


101 


[1-489];[581-961];[1010-1059] 


[490-580];[962-1009] 


102 


[1-485] 


[486-514] 


103 


[1-540] 


[541-1158] 


104 


[1-556] 


[557-1563] 


105 


[1-868];[870-1006]


[869-869];[1007-1621] 


106 


[1-491] 


[492-557] 


107 


[1-573] 


[574-600] 


108 


[1-457];[586-1110]


[458-585];[1111-1129]


109 


[1-521];[655-778]


[522-654] 


110 


[1-416];[478-614];[616-990];[992-
1065];[1068-1283]


[417-477];[615-615];[991-991];[1066- 
1067];[1284-1301] 


111 


[1-416];[478-614];[628-989];[991- 
1064];[1067-1282] 


[417-477];[615-627];[990-990];[1065- 
1066];[1283-1300] 


112 


[2-429];[1161-1202];[1212-1388];[1392- 
1589] 


[1-1];[430-1160];[1203-1211];[1389-1391];[1590-
1617]


113 


[1-487] 


[488-1634] 


114 


[1-70];[86-496]


[71-85];[497-693] 


115 


[1-358];[360-558]


[359-359];[559-784]


116 


[1-215];[218-495];[527-607]


[216-217];[496-526];[608-804]


117 


[1-466] 


[467-484] 


118 


[1-515];[906-963]


[516-905];[964-985] 


119 


[1-744];[746-816]


[745-745];[817-839]


120 


[1-85];[87-521]


[86-86];[522-583]


121 


[1-532] 


[533-1024] 
122


[1-318];[325-517];[567-660]


[319-324];[518-566];[661-760]


123 


[1-498] 


[499-594] 


124 


[1-427] 


[428-559] 


125 


[1-642] 


[643-744] 


126 


[1-341];[350-696]


[342-349];[697-824]


127 


[1-482] 


[483-526] 


128 


[1-338] 


[339-618] 


129 


[1-191];[193-429];[450-678]


[192-192];[430-449];[679-776]


130 


[19-463];[465-544] 


[1-18];[464-464];[545-998]


131 


[1-470] 


[471-779] 


132 


[1-533] 


[534-1025] 


133 


[1-498] 


[499-607]


134


[1-168];[170-326];[328-471];[552-738]


[169-169];[327-327];[472-551];[739-774]


135 


[1-346];[348-395];[440-473]


[347-347];[396-439];[474-611]


136


[1-324];[343-436]


[325-342];[437-925]


137 


[1-186];[188-251];[255-517]


[187-187];[252-254];[518-674]


138 


[1-488] 


[489-1725] 


139 


[1-101];[103-190];[292-327];[1091-1161];[1228-1314]


[102-102];[191-291];[328-1090];[1162-1227];[1315-1474]


140 


[1-465];[516-653]


[466-515] 


141 


[1-761];[763-857];[912-1326]


[762-762];[858-911];[1327-1490]


142 


[1-476] 


[477-661] 


143 


[1-531];[1471-1508];[1510-1547];[1587-1661]


[532-1470];[1509-1509];[1548-1586];[1662-1789]


144 


[1-492];[503-536]


[493-502];[537-2006]


145 


[1-570] 


[571-1096] 


146 


[1-536];[621-703];[729-1075];[1198-1445]


[537-620];[704-728];[1076-1197];[1446-1666]


147 


[1-555];[578-628]


[556-577];[629-1687] 


148 


[1-444];[1201-1474];[1480-1516]


[445-1200];[1475-1479];[1517-1747]


149 


[1-613];[626-658]


[614-625] 


150 


[4-199];[201-419];[421-492]


[1-3];[200-200];[420-420];[493-2045]


151 


[1-509] 


[510-788] 


152 


[1-483];[485-578]


[484-484];[579-1931] 


153 


[1-497] 


[498-514] 


154 


[5-509];[579-763];[765-1162]


[1-4];[510-578];[764-764];[1163-1183]


155 


[1-486];[1095-1500] 


[487-1094];[1501-1545] 


156 


[1-488];[740-797];[799-884];[895-974]


[489-739];[798-798];[885-894];[975-1068] 


157 


[1-161];[163-565];[567-701]


[162-162];[566-566];[702-1097] 


158 


[1-496];[692-754]


[497-691];[755-894] 


159


[1-483]


[484-703]


160 


[1-494] 


[495-849] 


161 


[1-491] 


[492-846] 


162 


[1-505];[575-759];[761-1164]


[506-574];[760-760];[1165-1176]


163 


[1-699] 


[700-1084] 


164 


[38-483];[485-556] 


[1-37];[484-484];[557-1793]


165 


[1-426];[1303-1444];[1717-1755];[1787- 
1825] 


[427-1302];[1445-1716];[1756-1786];[1826- 
1849] 
166


[2-264];[266-446];[448-519]


[1-1];[265-265];[447-447];[520-1748]


167 


[1-519];[523-552]


[520-522];[553-1275] 


168 


[1-457];[466-571]


[458-465];[572-1023] 


169 


[1-54];[57-501]


[55-56];[502-1085] 


170 


[1-541] 


[542-776] 


171 


[1-489] 


[490-1219] 


172 


[1-538];[977-1468]


[539-976];[1469-1487] 


173 


[1-631] 


[632-1915] 


174 


[21-776];[888-967];[969-1061];[1063-1137];[1819-1967]


[1-20];[777-887];[968-968];[1062-1062];[1138-1818];[1968-1990]


175 


[1-508] 


[509-1971] 


176 


[1-127];[129-538];[979-1470]


[128-128];[539-978];[1471-1613]


177 


[1-535];[973-1173];[1177-1330];[1332-1361]


[536-972];[1174-1176];[1331-1331]


178 


[1-599];[626-830];[1082-1113]


[600-625];[831-1081]


179 


[1-623];[1377-1406] 


[624-1376];[1407-1960]


180 


[1-414];[418-464]


[415-417];[465-1443]


181 


[1-522];[533-587]


[523-532];[588-605] 


182 


[1-78];[99-131];[136-327];[1153-1184];[1210-1274];[1284-1319];[1385-1416]


[79-98];[132-135];[328-1152];[1185-1209];[1275-1283];[1320-1384];[1417-1724]


183 


[1-512];[617-805];[871-952];[1387-1422];[1621-1661]


[513-616];[806-870];[953-1386];[1423- 
1 620]; [ 1 662- 1 ooo] 


184 


[1-453] 


[454-463] 


185 


[1-773] 


None 


186 


[1-413];[423-604];[606-739]


[414-422];[605-605];[740-753] 


187 


[1-117];[119-401] 


[118-118];[402-754] 


188 


[1-511];[684-870];[872-928];[935-981]


[512-683];[871-871];[929-934];[982-998]


189 


[1-605] 


None 


190 


[2-475] 


[1-1];[476-526]


191 


[1-910] 


None 


192 


[1-101];[103-668]


[102-102] 


193 


[1-520];[583-637]


[521-582] 


194 


[1-706] 


None 


195 


[1-145];[150-451];[466-670]


[146-149];[452-465] 


196 


[1-509] 


[510-510] 


197 


[1-500] 


None 


198 


[1-503];[505-585]


[504-504];[586-667] 


199 


[1-498] 


[499-514] 


200 


[1-462] 


None 


201 


[1-551] 


None 


202 


[1-482];[484-550]


[483-483] 


203 


[1-408] 


None 


204 


[1-519];[521-649]


[520-520];[650-665] 


205 


[1-261];[263-415];[417-640];[642-782]


[262-262];[416-416];[641-641];[783-1008] 


206 


[1-455] 


None 


207 


[1-402];[410-526]


[403-409];[527-749] 


208 


[1-520] 


[521-594] 
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| 209 


[1-197];[200-472]


[198-199];[473-2098] 


210




[.) i z-o i jj 5 [ i tzo-^fZoj 


211 


[1-689];[735-769]


[690-734] 


212 


[1-517] 


[518-914] 


213 


[2-576];[756-795];[1390-1441]


[1-1];[577-755];[796-1389];[1442-1489]


214 


[1-482] 


[483-776]


215 


[1-498] 


[499-1412] 


216 


[1 -505] ;[1000-1293J;[1295-1408J;[ 1744- 
1 77^1 


[506-999];[1294-1294];[1409-1743]


217 


[1-102];[104-291];[293-467];[486-708];[723-831];[833-900];[910-1031];[1054-1090];[1097-1153]


[103-103];[292-292];[468-485];[709-722];[832-832];[901-909];[1032-1053];[1091-1096];[1154-1251]


218 


[1-452] 


[453-894] 


219 


[1-554];[556-598]


[555-555];[599-910] 


220 


[1-38];[41-95];[98-386];[388-487]


[39-40];[96-97];[387-387];[488-519] 


221 


[1-34];[38-220];[222-335];[337-518]


[35-37];[221-221];[336-336];[519-632]


222 


[1-468] 


[469-652]


223 


[1-466] 


[467-650] 


224 


[1-466] 


[467-502] 


225 


[1-489];[653-1008]


[490-652];[1009-1739]


226 


[1-657] 


None 


227 


[1-480] 


[481-888] 


228 


[1-501] 


[502-716] 


229 


[1-612] 


[613-654]


230 


[1-477];[485-538]


[478-484];[539-635] 


231 


[1-476];[484-537]


[477-483];[538-634] 


232 


[1-367];[371-512] 


[368-370];[513-583] 


233 


[1-305];[307-442];[460-503];[553-646]


[306-306];[443-459];[504-552];[647-753] 


234 


[1-260];[262-345];[347-454];[473-515];[565-658]


[261-261];[346-346];[455-472];[516-564];[659-762]


235 


[1-427] 


[428-537] 


236 


[1-465] 


[466-994] 


237 


[1-471];[496-526];[557-587];[597-637]


[472-495];[527-556];[588-596];[638-662] 


238 


[1-338];[352-497]


[339-351];[498-1829] 


239 


[1-501] 


[502-1083] 


240 


[1-515];[1527-1583];[1585-1687];[1692- 
1831] 


[516-1526];[1584-1584];[1688-1691] 


241 


[1-515];[1526-1582];[1584-1686];[1691- 
1830] 


[516-1525];[1583-1583];[1687-1690] 
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Table VI 



Seq Id No 


Designation of domain 


Database 


Positions of 
domains 


242 


Cell attachment sequence


PROSITE 


141-143 


242 


Peptidase family M20/M25/M40 


PFAM


107-451 


244 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


26-35 


244 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


199-208 


244 


Mitochondrial carrier proteins 


PFAM 


5-84;87-175;178-272


244 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


12-36 


244 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


13-36 


244 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


131-144 


245 


Leucine zipper pattern 


PROSITE 


371-392 


249 


Leucine zipper pattern 


PROSITE 


20-41 


251 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


26-35 


251


Mitochondrial carrier proteins 


PFAM


J- 1 L 


251


Mitochondrial energy transfer proteins. 


BLOCKSPLUS


1 1 16. 

IZoo 


251 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


13-36 


254 


Pancreatic ribonuclease family signature


PROSITE


63-69 


254 


Pancreatic ribonucleases 


PFAM 


26-143


254 


PANCREATIC RIBONUCLEASE FAMILY 


BLOCKSPLUS 


49-69 


254


Pancreatic ribonuclease family proteins. 


BLOCKSPLUS 


115-140 


254 


PANCREATIC RIBONUCLEASE FAMILY 
SIGNATURE 


BLOCKSPLUS 


92-110


254 


PANCREATIC RIBONUCLEASE FAMILY 

SIGNATURE


BLOCKSPLUS


114-133


254 


Pancreatic ribonuclease family proteins.


BLOCKSPLUS


30-40


254


PANCREATIC RIBONUCLEASE FAMILY
SIGNATURE


BLOCKSPLUS


114-137


254


PANCREATIC RIBONUCLEASE FAMILY
SIGNATURE


BLOCKSPLUS


oy-oo 




L-lactate dehydrogenase active site 




900 9/1^ 


255 


lactate/malate dehydrogenase 


PFAM 


71-380 


255 


L-lactate dehydrogenase proteins.


BLOCKSPLUS


1 00-ZZ4 


9^ 


SIGNATURE 


BLOCKSPLUS


1 9 1 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


71-102 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


238-256 


255 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


183-203 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


288-323 
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255 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


207-224 


255 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


71-92 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


138-167 


256 


lactate/malate dehydrogenase 


PFAM 


71-124 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


96-121 


256 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


71-102 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


71-92 


256 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


71-100 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


71-84 


257 


Leucine zipper pattern 


PROSITE 


155-176 


259 


HORMA domain 


PFAM 


22-230 


261 


Leucine zipper pattern 


PROSITE 


142-163 


261 


Leucine zipper pattern 


PROSITE 


170-191 


263 


Leucine zipper pattern 


PROSITE 


15-36 


264 


Ubiquitin family 


PFAM 


1-82 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


17-62 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


21-68 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


26-68 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


17-68 


266 


u-PAR/Ly-6 domain 


PFAM 


60-119 


266 


Squash family of serine protease inhibit 


PFAM 


32-47 


267 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


185-202 


271 


LBP / BPI / CETP family signature 


PROSITE 


28-60 


271 


Pyrokinins signature 


PROSITE 


324-328 


271 


LBP / BPI / CETP family 


PFAM 


10-479 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


72-118 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


209-253 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


28-58 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


275-309 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


76-113 


272 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


102-111 


272 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


87-129 


272 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


102-111 


273 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


30-39 


273 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


15-57


273 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


30-39 


274 


RNA 3'-terminal phosphate cyclase signature 


PROSITE 


157-167 


274 


RNA 3'-terminal phosphate cyclase


PFAM 


1-368 
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274 


RNA 3'-terminal phosphate cyclase proteins.


BLOCKSPLUS 


12-44 


274 


RNA 3'-terminal phosphate cyclase proteins.


BLOCKSPLUS 


157-168 


275 


Ribosomal L27 protein 


PFAM 


31-86 


111 


Cell attachment sequence 


PROSITE 


292-294 


277 


DHHC zinc finger domain 


PFAM 


140-204 


279 


Endogenous opioids neuropeptides precursors 
signature 


PROSITE 


26-65 


279 


Vertebrate endogenous opioids neurope 


PFAM 


3-257 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


100-126 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


209-237


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


43-66 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


18-38 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


24-36 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


105-125 


280 


Leucine zipper pattern 


PROSITE 


136-157 


280 


Leucine zipper pattern 


PROSITE 


272-293 


283 


Immunoglobulins and major histocompatibility 
complex proteins signature 


PROSITE 


380-386 


283 


Immunoglobulin domain 


PFAM 


205-285;318- 
384 


283 


Immunoglobulins and major histocompatibility 
complex proteins. 


BLOCKSPLUS 


319-336 


284


Fucosyl transferase 


PFAM 


70-406 


285 


FAD/NAD-binding Cytochrome reductase 


PFAM 


27-149 


285 


Oxidoreductase FAD/NAD-binding domain 


PFAM 


176-290 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


58-86 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


75-86 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


274-283 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


141-156 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


274-286 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


60-85 


285 


CYTOCHROME B5 REDUCTASE
SIGNATURE


BLOCKSPLUS 


181-198 


285 


FLAVOPROTEIN PYRIDINE NUCLEOTIDE 
CYTOCHROME REDUCTASE SIGNATURE 


BLOCKSPLUS 


181-197 
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286 


Immunoglobulins and major histocompatibility 
complex proteins signature 


PROSITE 


380-386 


286 


Immunoglobulin domain 


PFAM 


205-285;318- 
384 


286 


Immunoglobulins and major histocompatibility 
complex proteins. 


BLOCKSPLUS 


319-336 


287 


Leucine zipper pattern 


PROSITE 


126-147 


288 


Leucine zipper pattern 


PROSITE 


20-41 


291 


Tissue inhibitors of metalloproteinases 
signature 


PROSITE 


24-36 


291 


Tissue inhibitor of metalloproteinases 


PFAM 


22-199 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


21-46 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


106-148 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


81-95


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


61-72 


294 


Domain of unknown function DUF59 


PFAM 


31-135 


296 


Immunoglobulin domain 


PFAM 


141-197 


297 


TonB-dependent receptor proteins signature 1 


PROSITE 


1-42 


298 


Fibroblast growth factor 


PFAM 


48-129 


299 


BolA-like protein 


PFAM 


39-114 


299 


PROTEIN BOLA TRANSCRIPTION 
REGULATION AC. 


BLOCKSPLUS 


68-98 


301


Cell attachment sequence 


PROSITE 


172-174 


303 


Ribosomal L27 protein


PFAM 


31-115 


304 


Leucine rich repeat C-terminal domain


PFAM 


173-222 


304 


Leucine Rich Repeat 


PFAM 


92-115;116-139;140-163;164-185


309 


Leucine rich repeat C-terminal domain 


PFAM 


173-222 


309 


Leucine Rich Repeat 


PFAM 


92-115;116-139;140-163;164-185


311 


NOL1/NOP2/sun family


PFAM 


201-276;353- 
378 


311 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


230-245 


311


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


231-245 


312


NOL1/NOP2/sun family


PFAM 


201-276 


312 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


230-245 


312 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


231-245 


314 


Leucine zipper pattern 


PROSITE 


8-29 


315 


Leucine zipper pattern 


PROSITE 


8-29 


341 


Immunoglobulin domain 


PFAM 


45-112 


349 


CDP-alcohol phosphatidyltransferases signature 


PROSITE 


54-76 


349 


Cytochrome b/b6 Qo site signature 


PROSITE 


97-102 


354 


SAM domain (Sterile alpha motif) 


PFAM 


82-147 


361 


Ribosomal Proteins L2 


PFAM 


96-124 


368 


DAD family


PFAM 


1-78 


370 


Ribosomal protein L34 


PFAM 


51-92 
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385 


Kelch motif 


PFAM 


20-66;68-114;116-162;164-209;211-265;270-316


386 


SPRY domain 


PFAM 


85-205 


388 


PHD-finger. 


BLOCKSPLUS 


329-339 


389 


Eukaryotic thiol (cysteine) proteases histidine 
active site 


PROSITE 


268-278 


389 


Heat shock hsp70 proteins family signature 3 


PROSITE 


332-346 


389 


Hsp70 protein 


PFAM 


3-509 


390 


Eukaryotic-type carbonic anhydrase 


PFAM 


20-59 


391 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-162 


392 


Sec1 family.


BLOCKSPLUS 


89-107 


393 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-182 


394 


Myc-type, 'helix-loop-helix' dimerization 
domain signature 


PROSITE 


13-28 


395 


Glutathione S-transferases. 


PFAM 


47-122;260-309 


396 


Transmembrane 4 family signature 


PROSITE 


112-134 


396 


Transmembrane 4 family 


PFAM 


66-273 


396


Transmembrane 4 family proteins. 


BLOCKSPLUS 


108-146 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


129-151 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


108-127 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


247-274 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


129-150 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


128-154 


397 


ATP/GTP-binding site motif A (P-loop) 


PROSITE 


6-13 


397 


ADP-ribosylation factor family 


PFAM 


2-172 


398 


Isochorismatase family 


PFAM 


17-147 


399 


PAP2 superfamily 


PFAM 


19-175 


400 


Zinc carboxypeptidases, zinc-binding region 2 
signature 


PROSITE 


117-127 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


36-57 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


73-93 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


114-134 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


145-165 


401 


Zinc finger, C2H2 type 


PFAM 


34-57;71-93;112-134;143-165


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


145-162 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


114-131 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


73-90 


402 


Zinc finger, C2H2 type, domain 


PROSITE 


113-133 
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402 


Zinc finger, C2H2 type, domain 


PROSITE 


144-164 


402 


Regulator of chromosome condensation 
(RCC1) signature 2 


PROSITE 


65-75 


402 


Zinc finger, C2H2 type 


PFAM 


111-133;142- 
164 


402 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


144-161 


402 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


113-130 


403 


Glutathione S-transferases. 


PFAM 


47-122;260-309 


405 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-182 


406 


WD domain, G-beta repeat 


PFAM 


267-304;333- 
370 


408 


Rhomboid family 


PFAM 


186-323 


410 


Ank repeat 


PFAM 


47-79 


410 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


BLOCKSPLUS 


78-89 


410 


Ank repeat proteins. 


BLOCKSPLUS 


48-56 


412 


Serine proteases, subtilase family, aspartic acid 
proteins. 


BLOCKSPLUS 


165-178 


414 


Sir2 family 


PFAM 


84-268 


416 


Kelch motif 


PFAM 


20-66;68-114;116-162;164-209;211-265;270-316


418 


Zinc-binding dehydrogenases 


PFAM 


16-313 


426 


Leucine zipper pattern 


PROSITE 


144-165 


447 


Cytochrome c family heme-binding site 
signature 


PROSITE 


19-24


447 


Immunoglobulins and major histocompatibility 
complex proteins signature 


PROSITE 


17-23 


453 


eIF-6 family 


PFAM 


3-103 


454 


Cell attachment sequence 


PROSITE 


226-228


456 


Leucine zipper pattern 


PROSITE 


211-232


457 


Leucine zipper pattern 


PROSITE 


236-257 


466 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


56-65


466 


SPRY domain 


PFAM 


375-500 


466 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


41-81 


466 


B-box zinc finger. 


PFAM 


110-153 


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


359-381 


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


443-457


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


359-380 


466 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


56-65 


479 


UBX domain 


PFAM 


329-408 


481 


TBC domain 


PFAM 


65-171 


481 


Probable rabGAP domain proteins. 


BLOCKSPLUS 


153-159 


482 


TBC domain 


PFAM 


65-177 
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| 482 Probable rabGAP domain proteins. 



153-159 | 



BLOCKSPLUS 
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Table VII 



Seq Id No 


Epitopes 


242 


98..109;119..127;136..147;156..170;242..248;255..265;318..328;356..363;399..407;443..450;475..490


243 


3..9;59..65;69..79;113..126;142..155;193..198;212..220;231..245;302..315


244 


29..36;33..42;79..87;139..147;269..274 


245 


101..107;141..151;156..165;196..207;225..233;242..251;253..260;284..298;323..330;339..347;395..406


247 


41..51;108..120;121..131;190..200;255..261;302..307 


248 


5..11;38..46;52..60;75..83;92..99;133..150;167..183;187..200;210..219;244..252;270..286;335..345;354..371;390..397


249 


68..80;91..99;132..138;185..193;265..273;276..293;295..306;30

^ 1A1*1.A~7 1.^%*1QA AC\1 


250 


28..37;60..67;73..81 


251


33..45;64..71


252 


20..30;35..45;49..59;74..83 


253 


3..9;59..65 


254 


22..33;35..52;53..67;70..77;80..100;106..117;142..147 


255 


116..123;147..156;201..208;262..278 


256 


10..15;116..121 


257 


41..51;52..66;72..80;94..101;120..127;134..147;180..193;204..210;227..240


258 


147..157;189..199


259 


52..59;66..76;103..113;115..127;131..140;143..148;181..199;242..250;253..262;262..273;279..289;330..341;342..366;373..394


260


94..107;112..119;125..134


261 


121..126;144..152;213..224


263 


44..50


264 


51..58;82..90;153..164 


266 


15..20;38..49;76..81;95..105 


267 


74..91;94..99;117..130;140..154;153..161;175..184;201..210;228..240;250..255


268 


36..42;43..54 


269 


41..46;64..73;80..100;106..122;160..172 


270 


38..48;82..88 


271 


34..40;72..79;111..123;146..153;251..259;307..314;316..322;372..377;436..444


272 


12..17;51..58;75..85;128..136


273 


4..13;56..64


274 


34..46;120..127;157..163;182..191;231..240;259..267;273..279; 

OAi OQQ.1/1/1 1^^ 


275 


30..55;72..78 


276 


27..35;37..45;49..61;61..77;102..109;144..152;170..180;179..188


277 


61..67;147..152;154..166;284..299;308..313 


278 


72..82;451..461;532..541 


279 


24..31;72..84;83..92;97..111;144..149;161..182;181..189;192..198;204..214;216..233;241..254;256..263
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280 


5.20:50 s^^i ^,204 .2 12:25 1..260;354..362 


281


>4..38;44..52 


282 


72..82;451..461;532..541 


283 


1..6;21..31;77..83;115..120;228..237;276..281;335..343;401..407;440..456;456..468


284 


% 9-39..47;50..66;78..94;111..122;132..141;169..174;190..202; 
213'..220;243..252;261..274;282..300;369..376;379..389;395..4 

03 


285 


29..38;42..47;58..65;100..110;121..134;156..161;161..173;201..207;230..239;243..254;290..302


286 


1..6;21..28;77..83;115..120;228..237;276..281;335..343;401..407


287 


2..10;94..104;248..258;268..286


288 


68..80;91..99;132..138;185..194;262..270;273..288;291..301;300..324;322..336;342..353;389..398


289 


23..38


291 


28..35;96..104;134..144;159..167;177..187;191..198 


292 


1..7;56..64;66..73;77..92 


293


40..45;99..109 


294 


47..57;120..126 


295 


31..61;76..82;143..149;156..169 


296 


133..143;151..156;161..167;169..181;185..194 


297 


50..58;59..69;113..123;120..137 


298 


45..55;52..63;106..117;118..128;126..131;148..155;157..164;172..190;212..221;232..247


299 


51..59;82..87;113..125;124..135 


300


72..82;451..461;532..541 


301 


43..52;88..105;192..211;255..271 


302 


3..18;37..44;57..65;70..76;98..113;121..134 


303 


30..55;72..77;82..88;106..113


304 


2..11;33..42;48..54;55..63;122..131;147..154;168..180;200..209;211..220;226..233;268..278;286..291


305 


22..31 


306 


5..11;25..35;72..81;124..134;147..157;163..178;177..186;185..195;207..217


307 


23..38 


308 


66..72;84..100 


309 


2..11;33..42;48..54;55..63;122..131;147..154;168..180;200..209;211..220;226..233;268..278;286..291


310 


45..52;60..68;88..94;99..109;113..120;121..134;162..171;169..184;194..202;209..215;223..235;239..248;273..281;292..301;319..329;336..341;389..394;398..405;421..426


311 


15..21;28..35;82..91;113..120;125..133;153..167;236..243;291..298;307..312;316..327;390..396;406..413;436..457


312 


15..21;28..35;82..91;113..120;125..133;153..167;236..243;291..298;307..312;316..327;352..370;370..382


313 


38..46;52..60;75..83;92..99;133..150;167..183;187..200;210..219;239..256


314 


36..42;52..58;65..70;80..87;143..155;161..168;176..185;203..208;263..272


315 


36..42;52..58;65..70;80..87;143..155;161..168


316 


^..47:49..58;106..117;125..132 
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317 


45..53;54..68;88..94;99..109;113..120;121..134;162..171;169..184;194..202;209..215;223..235;239..248;273..281;292..301;319..329;336..341;389..394;398..405;421..426


J lo 


/ii c/i./ca n. 1 i/i im.i/io ici.ic/c i/r .10/ ti a,oi a m.n 
4z..d4;69..77;1z4..13U; 14o\.l jj; 1 jo.. Ioj; loo.. zUU;z^ v.. zz /;z / 

1 iq/c«iqi inn 

1 ..ZoO,ZVJ\. JUU 


1 1 G 

j iy 


ic n«ic i7«i*c co*co ^/L«cn on-o^c ini'inc 11Q 

1 J..Z1,Zj..J / ,JO..j7,jo..04,oU..oy,o0..1UZ, 1UJ..1 ly 


i in 
5 IK) 


1 CO 

1 ..o,o 1 ..oy 


i i 1 
Jz 1 


111 1 1 

1 1 1 .. 1 lo 1 


m 
jzz 


1 11 .CA /CO.1/^ OC 

I ..z 1 ; jU..oo; /o..oj 


JZJ 


1 1 1 /C. /lO /CO 

1 1 .. lo;4y..oo 


IOC 

JZJ 


1/1 in«/in CC./CQ 1/C« 1 11 ill .HQ iiQ'KQ 1 /C/C 

14..zU,4U.. J J,oy.. /O, IZZ.. lJl,lZo..lJo,ljo.. loo 


JZO 


10 /IC-/C1 /CO* 01 QQ.1 in 1/11.1/11 1/IQ 
1o..4j,o1 ..Oo,o 1 ..oV, 1 1U.. 141 , 14Z.. 14V 


in 


/II /I0«C1 /C1-0C GC 
4j..4o, j j..OZ,oj..Vj 


JZo 


A G. 1 A /I /C 

4..y; J4..40 


jzy 


1 7»CG /CO 
1.. /, JO..OV 


330 


34..42;46..52;56..66 


331 


59..69;72..80;80..89;90..98;110..115


332 


14..24;50..56 


333 


1..6;41..47;86..96;120..134


334 


17..23;41..47;50..57;85..90;96..105;154..159;181..192;192..198


J J J 


/I G-7 1 1 Q 

4..y,zl..zo ; 


'J 1 /C 

55o 


/io /^c.in on 

4V..OJ; /U..oU j 


in 

551 


in in.i/c /ii. ci /ci .ha oi.iin iio.iic 110.117 1 /ii 
zU..jU;jo..4z;j j..ol ; /4..oj; llU..liy;lzj..lJo;lj/..l4z 


1 1 o 
J Jo 


11 cc.cc /cc./ci 01.00 inn.no 1 ni 
zl ..j j; j j..oj;oz..o1 ;oo.. 1UU;W.. 1U / 


1 lO 

55y 


11 ah- ci /cn-n 01 
J J..4 / , JZ..0U, /J..OZ 


1 A (\ 

J4U 


c 11.11 11.11 C 1 ./CC 11 

j..lz; 1 j..zl,jz..j l,oj../z 


1/11 

J41 


10 in./i/i ci.ci /ci./co oi.o/c 1 no 
iy..j(J,44.. jz, j 1 ..ol ,Oo..oZ,Vo.. lUo 


j4z 


10 1/^-/1/1 C/1.11 "71 

1o..zo;44..j4; /[..// 


j4j 


11 11. ci /c/c./co 11. on 00.01 00 

zl ..j j;j /..oo;oo.. / /;oU..oy;v 1 ..Vo 


J 44 


1/C /1/I.OQ QC110 1 1 <c.i 01 1 Q1 

jo..44;oy..yj; 1 iu..i io,ioi..iyj 


i /i c 
j4j 


10 10. i/i /11./11 ccon 0/C.1 10 ii/i.ico 1 /cc 1 /ci 1 i/c 
iy..zo,j4..41,4J..JJ,oU..oO, 1 IV.. Iz4, 1 jy..loj, I0/..I /O 


i /i /c 
J4o 


i/c iciq /10.00 QC.11/C 1/icini 111 

ZO..J j, jy..4y,oo..yj, 1 jo..14j,zu/..zi / 


1/17 

J4 / 


1 1 n«i/c ii'Ci /co«7c o/c 
Z.. 1U,Z0.. JZ, JZ..O0, / J..00 


1/1 c 
J4o 


71 77-Q7 0Q.1 1/1 11C 

/ 1../ /,oz..oV,l 14..1ZJ 


1 /lO 

J4V 


c/i /c/c.m 7/^«io/i mi 
j4..oo; /U.. /o,zV4..jUz 


i cn 
J jU 


1 /C 1C-11 17-/CQ on 

1o..Zj, J 1 .. J / ,oy..oU 


i c 1 


/CO 11.110 110.11C 1 A O 

00.. / j; 1 lo..lz", Ijj.. 14o 


i ci 
5 jz 


/CO 11.110 110.11c 1/10 
00../J,! lo.. IzV, 1 jj.. 14o 


i c i 
J j J 


1 A 1 Q./C1 /CQ./CO 77 

iu..io,oi ..oy,oo.. / / 


1 C A 

J j4 


11 in./i/i ci«ci oc.o/i ini.i/ii ici 
1 j..zU,44..j 1, j / ..oj,y4.. lUz, 14J.. 1 j 1 


i c c 

jjj 


10 in. 10 /ii. /ii c/i 
1o..jU,jo..4j,4j..j4 


1C*; 
J JO 


1 ^C'/in /lC'17fl 1 7^C« 1 7Q 1^/1*1 CI 1 
I ..0,4U..4J, 1ZU.. 1 zo, ily .. 1 J4, 1 J J .. 1 oz 


1C7 


AO CO* 111 1/11 «1C7 1 #C7 
Hy .. J7, 1 J J .. 14 1 , 1 JZ.. IO/ 


ICO 


0 7/c.en oc-o/c 1 ni 

7..ZO,OU..OJ,70.. 1 UZ 


1 co 
5jy 


QO 1 f\A .10 1 1 07 
yo.. 1U4, 1 0 1 .. 1 0 / 


i/cn 
JoU 


1 /C./I/1 C/1'111 171-1/17 1/^1 

1 ..o,44.. j4, 11j..1zj,14/..1o1 


JO 1 


fsA lA-11 00*119 1ZI1 

OH.. /H, / / ..y\J^ 1 1Z..1H1 


362 


20..30;37..42;53..61;74..83;110..119;125..134;138..147 


363 


1..11;26..31;53..60;97..105;110..117;141..146 


364 


16..24


365 


7..15
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367 


8..14


368 


16..23


369


5..14;59..71;74..83 


370


39..45;45..67;82..91 


371


">*7 /i/T.oi o/t.ao rift. i /in 1 c/c. 1 /cj 1/cn.nn 1 O/i.oAn ti /c.o/n 

37..46;81..86;92..99; 140..156;163..169;179..184;209..216;242.. 

T^O-T^C 1^0 
ZJZ,ZDO..ZOO 


j /z i 


*71 C1.Q1 1 A/1 - 1 /I 1 1/1^ 

/Z..o 1 ;V3.. 104, 141 .. 140 


3 /3 


*7 1 T7.Q/: m 

/ 1 .. / /;oo..93 


374 ' 


1 *7.11 >i*7./1*7 C*7.CO 

l../;31..4/;4/..5 /;59..o5 


3 / 5 


x. io;55..o0;o3..90 


1 *7/C 

3 /o 


34..43;4U..4o;o /..o 1 


1 *7*7 

3 / / 


1*7 1Q./1Q /Z/I./CQ 7Q.1 1 1 111 

z / ..3o;4y..Do; j4..o4;oo.. /o; 1 1 i .. izz 


1*7C 

3 /o 


Q 1/C-QA O^Q^ 1 A1 

y..zo,ou..oj,yo.. iuz 


3 /V 


A 1 eA.CI cq.iqc Ort/I.l 1 Q 11*7.11*7 1/1*7 

4 1 .. jU,jz..jo, iyj..ZU4,3 1 o..3Z / ,33 / ..34 / 


3oU 


y©.. 1U3 


381 


46..51 
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128 


Br:1


129


Br:2;FB:6;Li:1;SG:3;Te:2


130


Br:25;FB:3;FL:2


131


Br:1


132


Br:1


133


Br:1


134


Br:2;SN:1


135


Br:1


136


AG:1;Br:1;FL:1


137


Br:1;Ce:1;FB:1;FK:1;FL:2;Pl:3;SN:1;Te:1;UC:1


138


Br:43


139


Br:11;CP:2;Co:1;DM:6;FB:1;FK:6;He:2;Ki:4;LC:1;LG:1;Ov:40;Pa:1;Pl:2;Pr:1;SN:2;Sp:1;Te:9;UC:1;Ut:3


140


Br:23;Ce:1;DM:3;FB:38;FK:17;FL:2;HP:1;He:1;Ki:8;LC:3;LG:2;Li:6;Lu:1;Ly:1;Ov:40;Pr:4;SC:2;SN:4;Sp:1;Te:5;UC:1;Ut:1


141 


Br:39;FB:3;SN:2 


142 


Br:10;SN:2 


143 


Br:26;FK:2;HP:1;LC:1;Li:2;Ov:14;Pl:3;Pr:3;Te:5


144 


Br:14;Pr:2 


145 


FB:12;LG:1;Pr:4;Te:1;Ut:2


146


Li:1;Ov:2;Pr:5;SG:11


147


Li:1;Te:1


148


Br:1;FB:1;Li:1;Te:1


149


Br:3;FB:5;FK:5;Li:1;Pl:8;Te:5


150


FK:6;Pr:2;SG:8


151


FK:9


152


FK:6;Pr:2;SG:9


153


Te:1


154


FB:28;Ov:4


155


Br:21;Ce:1;FB:32;FK:4


156


Br:5;CP:1;FB:16;FK:3;He:1;Ki:5;Li:1;Ov:15;Pl:3;SG:2;SI:1;Sp:1;UC:1


157 


FB:14;FK:1;FL:1;SG:1 


158 


FB:7 


159 


FB:10 


160 


Ce:2;FB:12 


161 


Ce:2 


162 


FB:28;Ov:2 


163 


FB:14;FK:1;FL:1;SG:1 


164 


FK:4;Pr:1;SG:9


165


Br:4;Co:1;Ki:1;Ov:2;Pr:4;SG:1


166


FK:6;Pr:2;SG:9


167


Br:1;FB:1;SG:5


168


Br:1;FB:5;FK:7;SG:1;UC:1


169 


FK:2 


170 


FL:12 
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171 


Br:2;FB:1;FK:1;Pl:7


172


Br:106;FB:2;Pl:7


173


Br:14;FB:1;Pl:2;Te:1


174


Br:17;He:1;Pl:1;SC:2;Te:1


175


Br:14;Pr:2


176


Br:106;FB:2;Pl:7


177


Br:114;FB:7;FK:7;Ov:2;Pl:7;Pr:2;Te:9


178


Br:16;CP:2;FB:2;FK:2;FL:1;Li:1;Pl:13;Pr:3;SC:1;Ut:1


179


FL:1;HP:2;Pr:2;Te:1


180


Pr:2


181


FB:1;Ov:2;Pr:1;UC:1


182


BM:1;Br:4;DM:1;FB:6;FK:6;Ki:5;LC:2;LG:1;Li:1;Lu:1;Ov:15;Pl:1;Pr:2;SC:1;Sp:2;Te:2;Ut:1


183


Br:8;CP:1;Co:2;DM:4;FB:1;FK:1;Ki:4;LC:1;Li:3;Ov:33;Pl:1;Pr:5;SC:2;SN:1;Sp:1;Te:5;UC:1;Ut:2


184


Pr:1


185


FB:2;Li:1;Ov:1;SG:7;Te:5


186


Te:3


187


Te:1


188


Br:18;CP:1;DM:5;FB:40;FK:23;FL:2;He:3;Ki:10;LC:2;LG:1;Li:13;Lu:3;Ly:2;Mu:1;Ov:54;Pl:5;Pr:14;SC:2;SG:2;SI:2;SN:4;Sp:3;Te:4;UC:4


189


Li:1;Te:1


190


Br:7;CP:1;FB:1;FK:4;FL:5;He:1;Li:1;Ov:1;Pl:2;Pr:4;SG:1


191


Li:2;Te:4


192


AG:1;Br:2;CP:1;FB:32;FK:1;Li:1;Ov:36;Pl:49;Pr:3;SC:1;SG:4;SN:4;Te:9;UC:1;Ut:2


193 


FB:31;FK:75;FL:7;Ov:12;Pl:23;Pr:8;SG:3;Te:16 


194 


Te:2 


195 


Te:7 


196 


Te:2 


197 


Te:3 


198 


Li:10;Te:43 


199 


Br:35;CP:3;FB:39;FK:56;FL:7;HP:1;LG:1;Li:1;Ly:1;Ov:2;Pl:10;Pr:8;SG:1;Te:4;Ut:2


200


FB:17;FK:9;FL:5;Ov:21;Pl:41;Te:3


201


FK:16;SI:1


202


Br:1;Co:1;FB:111;FK:25;He:1;Li:4;Ov:3;Pr:6;Te:1


204


Te:7


205


Li:7;Te:28


206


FB:28;Li:2;Ov:23;PG:11;Pl:45;SG:17;SI:11;Te:9


207


FB:16;FK:1;Ov:1;SC:1;Te:1


208


FB:5


209


FB:6


210


Br:1;FB:22


211


Br:2;Ce:3;FB:6;FK:1


212


Br:1;Co:2;FB:22;FK:2;LG:2;Mu:2;Pl:2;SG:4
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213 


Br:2;DM:1;FB:8;FK:8;FL:1;Ki:1;LG:3;Ov:5;Pa:1;Pl:4;Pr:1;SN:2;UC:1


214


FB:7


215


FB:4


216


Ov:3;SG:3


217


Br:4;CP:2;DM:1;FB:9;FK:3;Ki:2;LC:1;LG:1;Lu:3;Ly:1;Ov:14;Pl:1;Pr:1;SC:1;SG:2;Sp:1;Te:1;Ut:1


218


FB:4;FK:2;Pl:1;Pr:11;SG:1


219


Br:7;CP:3;FB:2;FL:1;HP:4;Lu:1;Ly:2;Mu:1;Ov:3;Pl:1;Pr:1;SN:2;Te:1


220


Br:1;FL:1;Pl:2


221


Co:1;FB:2;FL:1;Li:1;Pl:2


222


FL:1;SG:2


223


Li:1;Te:1


225


Li:10


226


Li:1;Te:4


227


Li:1


228


Br:1


229


Br:3


230


Br:5;Ce:1;Co:1;DM:3;FB:1;FK:1;He:1;LC:1;LG:2;Ov:16;Pl:3;Pr:1;Te:2;Ut:1


231


Br:3;Ce:1;Co:1;DM:3;FB:1;FK:1;He:1;LC:1;LG:2;Ov:16;Pl:3;Pr:1;Te:2;Ut:1


232


AG:1;Br:17;CP:2;DM:1;FB:51;FK:9;FL:3;Li:3;Ov:3;Pl:2;Pr:10;SC:1;SG:5;Te:2;Ut:1


233


Br:13


234


Br:5


235


Br:1;Pl:1


236


Br:9


237


Br:22;DM:2;FB:17;FK:9;Ki:4;LG:1;Li:1;Lu:2;Ov:24;Pr:3;SC:1;SI:1;SN:2;Te:2


238


Br:17


239


Br:11


240


Br:28;Ce:1;DM:5;FB:52;FK:40;FL:2;HP:1;He:2;Ki:3;LC:1;LG:3;Li:1;Ly:1;Ov:28;Pl:1;Pr:5;SC:1;SI:1;SN:3;Sp:6;Te:1;UC:1;Ut:1


241


Br:4;Ce:1;DM:5;FB:5;FK:7;HP:1;He:2;Ki:3;LC:1;LG:3;Li:1;Ly:1;Ov:28;Pl:1;SC:1;SN:3;Sp:6;Te:1;UC:1;Ut:1
Table X 



Seq Id No 


Low frequency 


High frequency 

PVnfOGGlAtl 


i 
i 




131 ,UV 


0 
z 




Pr 

IT I 


"2 
3 


D r Tp 
131, 1 C 


0\/ pn P1 <sT 
uv,ru,n,oi 


A 
H 




A CI 






ra 


o 




ra. 


7 

/ 




131 


o 
o 




OVJ 


Q 

y 




F*\/f 1-To TiTi* Oi7 Pq 
U1V1 , xl c , JVl , ^ V, ra 


iu 




it 


1 1 




ou 


1 o 

1Z 




n 


1 1 

Yd 




T7T T i 


1 A 

14 




T i Tp 


1 c 
1 J 






1 A 

lo 




T i To 

la, i e 


1 7 
1 / 






1 a 

1 o 




T i To 

i»i, i e 


1 o 
iy 




T i To 

iri, i e 


on 
zu 




To 


O 1 
Z 1 




To 


00 
ZZ 




To 


01 




To 


Z4 




T i ! 

1*1 


0<s 

ZD 




To i 


ZO 




To 


07 

Z / 




T To 1 


Zo 




To 


OQ 

zy 


PI 




in 




T i 


1 1 




F1<T 


10 




Pp 


11 






3*+ 


r 13 




1^ 




1 c 






Oil 


37 




SG 


38 




FB 


40 




SG 


41 




SG 


42 




BM,SG 


43 




SG 
44 


- 


Mu,Pl,SG 


45 


- 


BM,SG 


46 


- 


BM,Ki,Ov 


47 


- 


SG 


48 


FB,FK,Pr 


Pl 


49 


- 


Ki,Ov 


50 


Br,FB,FK,SG 


Li,Ov,Te 


51 


- 


FL 


52 


- 


FL,LG,UC 


53 


- 


Pl 


54 


- 


Ki,Ov,Pa,Sp 


55 


- 


FL 


57 


- 


FL 


58 


- 


FL 


59 


- 


SG 


62 


- 


Ce,FB 


63 


- 


Br 


64 


- 


CP 


65 


- 


FB,Th 


66 


FK,SG,Te 


Ov,PG,Pl 


67 


- 


FB,Ki,Lu 


68 


- 


Br 


69 


FB 


Br 


70 


- 


Br,DM,He 


71 


- 


DM,He 


72 


- 


Br 


73 


- 


Br 


74 


- 


Br 


75 


- 


Br 


76 


FB 


Br 


77 


FB 


Ov,Pr 


78 


- 


Ki,Ov 


80 


FB 


DM,Ki,Ov 


82 


- 


Li 


83 


- 


Ki,Li,Ov 


84 


- 


Ov 


85 


- 


Li 


86 


- 


Li 


87 


Br,Pr,SG 


Ov,PG,Pl 


88 


- 


Te,Ut 


89 


- 


Te 


90 


- 


Te 


91 




Ov 


92 




Te 


93 




SN 


94 




Te 
95 


- 


Li 


96 


- 


AG 


97 


- 


FK 


98 


- 


Te 


99 


FK 


Te 


100 


- 


Ov 


101 


FK 


Pl 


102 


- 


FB 


103 


- 


Te 


104 


FB,Li,SG,Te 


FK 


105 


- 


DM,SN 


106 


- 


FB 


107 


Br,Pl 


FB,FK 


108 


- 


FB,Lu 


109 


- 


Pr 


110 


- 


He,Ki,Ov 


111 


- 


Ce,He,Ki,Lu,Ov 


112 


- 


Lu,SG 


113 


- 


HP,SG 


114 


- 


FK 


115 


- 


FK 


116 


FB 


DM,LC,Ov,Ut 


117 


- 


Ce,UC 


118 


- 


Ov,Sp 


119 


- 


Te 


120 


FB 


Co,Pl 


121 


- 


AG,Br 


122 


- 


Br 


124 


- 


Ki,Ov 


125 


- 


FL,Pr,Th 


127 


- 


BM,SC,Ut 


130 


- 


Br 


134 


- 


SN 


136 


- 


AG 


137 




Ce,UC 


138 


FB 


Br 


139 


FB 


DM,Ki,Ov,Ut 


140 


Pl 


Ki,Ov 


141 


- 


Br 


142 


- 


Br,SN 


143 


FB 


Br,Ov 


144 


- 


Br 


145 




FB,Ut 


146 




SG 


149 




Pl 


150 




FK,SG 
151 


- 


FK 


152 


- 


FK,SG 


153 


- 


Te 


154 


- 


FB,Ov 


155 


- 


Br,FB 


156 


- 


Ki,Ov 


157 


- 


FB 


158 


- 


FB 


159 


- 


FB 


160 


- 


Ce,FB 


161 


- 


Ce 


162 


- 


FB 


163 


- 


FB 


164 


- 


SG 


165 


- 


Co,Ki,Ov 


166 


- 


FK,SG 


167 


- 


SG 


168 


- 


FK 


169 


- 


FK 


170 


- 


FL 


171 


- 


Pl 


172 


FB,FK,Pr 


Br 


173 


- 


Br 


174 


- 


Br,He,SC 


175 


- 


Br 


176 


FB,FK,Pr 


Br 


177 


FB 


Br 


178 


- 


Br,Pl 


179 


- 


HP 


180 


- 


Pr 


181 


- 


Ov,UC 


182 


- 


Ki,Ov,Sp 


183 


FB 


DM,Ki,Ov 


185 


- 


SG,Te 


186 


- 


Te 


187 


- 


Te 


188 


Pl 


DM,Ki,Ov 


190 


- 


FL 


191 


- 


Te 


192 


Br,FK 


Ov,Pl 


193 


Br 


FK,Ov 


194 


- 


Te 


195 




Te 


196 




Te 


197 




Te 


198 


FB 


Li,Te 
199 


- 


FK 


200 


Br 


Ov,Pl 


201 


- 


FK 


202 


- 


FK 


203 


Br,Pl 


FB 


204 


- 


Te 


205 


- 


Li,Te 


206 


Br,FK,Pr 


Ov,PG,Pl,SG,SI 


207 


- 


FB 


208 


- 


FB 


209 


- 


FB 


210 


- 


FB 


211 


- 


Ce 


212 


- 


Co,FB,Mu 


213 


- 


Ki,LG,Ov 


214 


- 


FB 


215 


- 


FB 


216 


- 


Ov,SG 


217 


- 


Ki,Lu,Ov 


218 


- 


Pr 


219 


- 


CP,HP,Ly,Ov,SN 


221 


- 


Co 


222 


- 


SG 


223 


- 


SG 


225 


- 


Li 


226 


- 


Te 


227 


- 


Li 


229 


- 


Br 


230 


- 


DM,Ov 


231 


- 


DM,Ov 


232 


- 


FB 


233 


- 


Br 


234 


- 


Br 


236 


- 


Br 


237 


- 


Ki,Lu,Ov 


238 




Br 


239 




Br 


240 


Pl,Te 


DM,FK,Ki,Ov,Sp 


241 


FB 


DM,He,Ki,Ov,Sp 
Table XI 





Subcellular localization 


7 


ni ir*1 pcir 


1 ^ 


extracellular, including cell wall 


* 9fl 
zu 


mitochondrial 


Z 1 


nuclear 


9£ 
zo 


nuclear 






^7 

j / 


endoplasmic reticulum 


^8 


extracellular, including cell wall 


J7 


endoplasmic reticulum 


*T 1 


endoplasmic reticulum 




endoplasmic reticulum 


70 
/U 


nuclear 


71 
/ 1 


nuclear 


79 
/Z 


nuclear 


7S 
/o 


nuclear 


Q8 


nuclear 


QQ 

yy 


l l/l 1 Arty" 

nuclear 


1 n*; 

I 1UD 


mitochondrial 


1 OR 
1U© 


endoplasmic reticulum 


1 1 £>. 
1 lO 


mitochondrial 


1 1 7 


mitochondrial 


1 'XA 


nuclear 


1 ^ 
1 jj 


nuclear 


1 ^7 


mitochondrial 


1 ^0 

i oy 


nuclear 


1 £0 
1 ou 


nuclear 


1 ^1 

1 O 1 


nuclear 


1 71 
1/1 


nuclear 


1 78 
1 /o 


endoplasmic reticulum 


1 89 
1 6Z 


nuclear 


1 8A 
1 OH- 


nuclear 


1 8S 


endoplasmic reticulum 


1 8f» 

1 OVJ 


nil r*l f*Q T* 


1 87 


ni ir*1 PQr 
IlUVlCal 


1 88 
1 oo 


llUClCal 


194 


nuclear 


195 


nuclear 


196 


nuclear 


200 


mitochondrial 


204 


nuclear 


205 


nuclear 


206 


nuclear 
211 


nuclear 


212 


nuclear 


213 


nuclear 


214 


endoplasmic reticulum 


215 


endoplasmic reticulum 


216 


endoplasmic reticulum 


218 


nuclear 


220 


endoplasmic reticulum 


224 


nuclear 


225 


nuclear 


230 


mitochondrial 


231 


mitochondrial 


238 


cytoplasmic 
Table XII 



Seq Id No in priority applications 


Internal designation 


Seq Id No in present application 


1 1 Q 


1 1 q ooi a o P7 

1 1 7-UU D-*+-U-V^Z-V^o 


1 
1 


770 
zzu 


in«; oi£ i o ph r^Q 


z 


I/I** 


i oi^ i n 010 r^Q 

1 UD~U 1 0-j-U-UlU-Lo 


J 


11/1 


1 o^ o7^ 1 o a ^ pq 


A 


1 

l Dy 


1 o<. oi 1 i n ah 




7 1 Q 
Z 1 y 


1 o^ on 7 o p>i r"<i 


O 


7^n 

ZDU 


1 oi*\ 7 o pq 


7 
/ 


Zl / 


017 o n m 1 r^Q 

1UD-UD / -Z-U-rl 1 l-^o 


O 


1/LH 


1 o^ o^i a o ps pq 


Q 

y 


1 1 c 

1 Id 


i ac H7zi i n uin r^Q 

1UD-U /4- D-U-ri lU-^o 


1 O 
1 U 


1 1 

D 1 


inc nsQ i n r^Q 

1 Uj-U07-j -U-UlU-Lo 


1 1 
1 1 


1 yo 


i ac aqc 9 a 1 1 r^Q 


1 7 
1Z 


1 *i/l 
1 D4 


i o^ oo^ 1 o pi r^Q 


1 \ 




1 017 1 o pq r^Q ^*vr 
iuo-ud / - i -u-i^v-t^o .cor 


1 /l 
14 


1££ 
DOO 


i a/: 1 n pq fv- 


1 c; 
ID 


7Q 

ly 


i a/: A/11 /l T-J1 r^Q 
1 U0-U4J -4-U-ri j -L^o 


1 

lO 


Vd 


1 i a aai i a r^n r^Q 
1 1 U-UU / - 1 -U-l^ / -l^o 


1 7 
1 / 




11/1 Ai /; i a uc r^o 
1 14-U 10-1 -U-rio-L,o 


1 8 
1 5 


7/1 

Z4o 


1 1 aa/1 i n a ^ r^c 
1 1 0-UU4-D -U-ZvO-t^o 


1 Q 

iy 


1 S7 


1 1 n^/i i n p^ r^Q 

1 1 0-UD40-U-J^0-V^i3 


70 

zu 


701 


1 1^ o^ 1 o ai r^Q 

1 lO-UDD-1 -U-/\D-l^o 


7 1 
Z 1 


7Q£ 


1 1^ o^ 7 o P7 r^<v 

1 1 0-UDD-Z-U-r l-V^tj 


77 
ZZ 


777 

All 


1 1^ nsfi /i n aq r^Q 


71 

Zj 


A\ 
4 1 


1 1 o-uy 1-1 -u-ijy-Lj 


7/1 
Z^t 






75 


! / o 


11^111 10 T-IQ r"<i 
1 lO-l 1 1 - 1 -V-riy-K^tj 


7£ 
zo 


Z4D 


1 1 ^ 1 1 1 -A 0 r^Q 

llO-ll 1 -4-U-r>j-V^i3 


77 
Z / 


1 Ozl 
1 U4 


11^ 1 1 ^ 90 PR r^Q 

1 lO-l 1 j-Z-U-ro-Lo 


7R 1 
ZO 


zdv 


11^110 10 T-T^ r^Q 
1 lO-l l7-j-U-rlj-v^i3 


7Q 1 

Z7 


7£Q 

zoy 


1 1 7 001 ^ 0 01 

1 1 / -\J\J 1 - J>-U - 0-3 _ v^o 


10 ! 




14S ?S ^-0 R4 rr%r 


D 1 


1 & 
1 oo 


14^ 1 0 VIA P^ fr 


^7 


1 £Q 
i \jy 


i/ic c^ ^ n r 1 ^ 




117 




Id 


! 771 


i 1 57-1 S-4-0-R1 1-CS 


^5 

J*/ 


190 


160-103-1-0-F11-CS 


36 


244 


160-37-2-0-H7-CS 


37 


151 


160-58-3-0-H3-CS 


38 


149 


160-75-4-0-A9-CS 


39 


307 


174-10-2-0-F8-CS 


40 


264 


174-33-3-0-F6-CS 


41 
168 


174-38-1-0-B6-CS 


42 


202 


174-38-3-0-C9-CS 


43 


28 


174-39-2-0-A3-CS 


44 


331 


174-41-1-0-A6-CS 


45 


258 


174-5-3-0-H7-CS 


46 


84 


174-7-4-0-H1-CS 


47 


294 


175-1-3-0-E5-CS.cor 


48 


294 


175-1-3-0-E5-CS.fr 


49 


310 


180-19-4-0-F4-CS 


50 


311 


181-10-1-0-D10-CS 


51 


263 


181-16-1-0-G7-CS 


52 


304 


181-16-2-0-A7-CS 


53 


109 


181-20-3-0-B5-CS 
1. An isolated polynucleotide, said polynucleotide comprising a nucleic acid sequence 
encoding: 

i) a polypeptide comprising an amino acid sequence having at least 
about 80% identity to any one of the sequences shown as SEQ ID 
NOs: 242-482 or any one of the sequences of polypeptides encoded 
by the clone inserts of the deposited clone pool; or 
ii) a biologically active fragment of said polypeptide. 

2. The polynucleotide of claim 1, wherein said polypeptide comprises any one of the 
sequences shown as SEQ ID NOs: 242-482 or any one of the sequences of the polypeptides encoded 
by the clone inserts of the deposited clone pool. 

3. The polynucleotide of claim 1, wherein said polypeptide comprises a signal peptide. 

4. The polynucleotide of claim 1, wherein said polypeptide is a mature protein. 

5. The polynucleotide of claim 1, wherein said nucleic acid sequence has at least about 
80% identity over at least about 100 contiguous nucleotides to any one of the sequences shown as 
SEQ ID NOs: 1-241 or any one of the sequences of the clone inserts of the deposited clone pool. 

6. The polynucleotide of claim 1, wherein said polynucleotide hybridizes under 
stringent conditions to a polynucleotide comprising any one of the sequences shown as SEQ ID 
NOs: 1-241 or any one of the sequences of the clone inserts of the deposited clone pool. 

7. The polynucleotide of claim 5, wherein said nucleic acid sequence comprises any 
one of the sequences shown as SEQ ID NOs: 1-241 or any one of the sequences of the clone inserts of 
the deposited clone pool. 

8. The polynucleotide of claim 1, wherein said polynucleotide is operably linked to a 
promoter. 

9. An expression vector comprising the polynucleotide of claim 8. 

10. A host cell recombinant for the polynucleotide of claim 1. 

11. A non-human transgenic animal comprising the host cell of claim 10. 



12. A method of making a GENSET polypeptide, said method comprising 

a) providing a population of host cells comprising the polynucleotide of 
claim 8; and 

b) culturing said population of host cells under conditions conducive to the 
production of said polypeptide within said host cells. 

13. The method of claim 12, further comprising purifying said polypeptide from said 
population of host cells. 

14. A method of making a GENSET polypeptide, said method comprising 

a) providing a population of cells comprising the polynucleotide of claim 8; 

b) culturing said population of cells under conditions conducive to the 
production of said polypeptide within said cells; and 

c) purifying said polypeptide from said population of cells. 

15. An isolated polynucleotide, said polynucleotide comprising a nucleic acid sequence 
having at least about 80% identity over at least about 100 contiguous nucleotides to any one of the 
sequences shown as SEQ ID NOs: 1-241 or any one of the sequences of the clone inserts of the 
deposited clone pool. 

16. The polynucleotide of claim 15, wherein said polynucleotide hybridizes under 
stringent conditions to a polynucleotide comprising any one of the sequences shown as SEQ ID 
NOs: 1-241 or any one of the sequences of the clone inserts of the deposited clone pool. 

17. The polynucleotide of claim 15, wherein said polynucleotide comprises any one of 
the sequences shown as SEQ ID NOs: 1-241 or any one of the sequences of the clone inserts of the 
deposited clone pool. 

18. A biologically active polypeptide encoded by the polynucleotide of claim 15. 

19. An isolated polypeptide or biologically active fragment thereof, said polypeptide 
comprising an amino acid sequence having at least about 80% sequence identity to any one of the 
sequences shown as SEQ ID NOs: 242-482 or any one of the sequences of polypeptides encoded by 
the clone inserts of the deposited clone pool. 

20. The polypeptide of claim 19, wherein said polypeptide is selectively recognized by 
an antibody raised against an antigenic polypeptide, or an antigenic fragment thereof, said antigenic 
polypeptide comprising any one of the sequences shown as SEQ ID NOs: 242-482 or any one of the 
sequences of polypeptides encoded by the clone inserts of the deposited clone pool. 

21. The polypeptide of claim 19, wherein said polypeptide comprises any one of the 
sequences shown as SEQ ID NOs: 242-482 or any one of the sequences of polypeptides encoded by 
the clone inserts of the deposited clone pool. 

22. The polypeptide of claim 19, wherein said polypeptide comprises a signal peptide. 

23. The polypeptide of claim 19, wherein said polypeptide is a mature protein. 

24. An antibody that specifically binds to the polypeptide of claim 19. 

25. A method of determining whether a GENSET gene is expressed within a mammal, 
said method comprising the steps of: 

a) providing a biological sample from said mammal; 

b) contacting said biological sample with either of: 

i) a polynucleotide that hybridizes under stringent conditions to the 
polynucleotide of claim 1; or 

ii) a polypeptide that specifically binds to the polypeptide of claim 19; and 

c) detecting the presence or absence of hybridization between said polynucleotide 
and an RNA species within said sample, or the presence or absence of binding 
of said polypeptide to a protein within said sample; 

wherein a detection of said hybridization or of said binding indicates that said GENSET gene is 
expressed within said mammal. 

26. The method of claim 25, wherein said polynucleotide is a primer, and wherein said 
hybridization is detected by detecting the presence of an amplification product comprising the 
sequence of said primer. 






27. The method of claim 25, wherein said polypeptide is an antibody. 

28. A method of determining whether a mammal has an elevated or reduced level of 
GENSET gene expression, said method comprising the steps of: 

a) providing a biological sample from said mammal; and 

b) comparing the amount of the polypeptide of claim 19, or of an RNA species 
encoding said polypeptide, within said biological sample with a level 
detected in or expected from a control sample; 

wherein an increased amount of said polypeptide or said RNA species within said biological 
sample compared to said level detected in or expected from said control sample indicates that said 
mammal has an elevated level of said GENSET gene expression, and wherein a decreased amount 
of said polypeptide or said RNA species within said biological sample compared to said level 
detected in or expected from said control sample indicates that said mammal has a reduced level of 
said GENSET gene expression. 

29. A method of identifying a candidate modulator of a GENSET polypeptide, said 
method comprising: 

a) contacting the polypeptide of claim 18 with a test compound; and 

b) determining whether said compound specifically binds to said 
polypeptide; 

wherein a detection that said compound specifically binds to said polypeptide indicates that 
said compound is a candidate modulator of said GENSET polypeptide. 









WO 01/42451 



PCT/IB00/01938 



[Figure 1 (sheet 1/5): restriction map. Site labels legible in the extraction: Asp 700 (2491), Bsa HI (2429), Acy I (2429), Ava II (2252).] 

Figure 1 




[Figure 2: restriction map. Site labels legible in the extraction: Bgl I (476), Sty I (721), Mfe I (727), EcoO109I (765), Asp 718 (771), Sap I (952).] 

Figure 2 






[Figure 3 (sheet 3/5): flow chart. Boxes legible in the extraction, in order: START; STORE NEW SEQUENCE TO A MEMORY; OPEN DATABASE OF SEQUENCES; READ FIRST SEQUENCE IN DATABASE; PERFORM COMPARISON OF NEW SEQUENCE AND STORED SEQUENCE; GO TO NEXT SEQUENCE IN DATABASE; END. Reference numerals legible: 200, 201, 202, 204, 206, 210, 220.] 

Figure 3 
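
The loop that Figure 3 outlines (store a new sequence, open a database, then compare the new sequence with each stored sequence in turn) can be sketched in Python. This is only an illustration under stated assumptions: the function names, the ungapped position-by-position comparison, the 80% reporting threshold, and the toy database are all hypothetical, and the figure does not specify the comparison method.

```python
# Hypothetical sketch of the Figure 3 loop; names and threshold are assumptions.

def percent_identity(a, b):
    """Ungapped identity over the shorter length (a stand-in comparison)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / n


def scan_database(new_sequence, database, threshold=80.0):
    """Compare new_sequence with every stored sequence and collect the hits."""
    hits = []
    for name, stored in database:  # read first sequence, then go to the next
        score = percent_identity(new_sequence, stored)
        if score >= threshold:
            hits.append((name, score))
    return hits


# Toy database: (name, sequence) pairs invented for the example.
database = [
    ("seq_a", "ATGGATCCCAAA"),
    ("seq_b", "ATGGATCCCAAG"),
    ("seq_c", "TTTTTTTTTTTT"),
]
hits = scan_database("ATGGATCCCAAA", database)
for name, score in hits:
    print(f"{name}: {score:.1f}% identity")
```

A production comparison would normally use an alignment program rather than a position-by-position count; the loop structure is the point of the sketch.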






[Figure 4 (sheet 4/5): flow chart. Boxes legible in the extraction, in order: START; STORE A FIRST SEQUENCE TO A MEMORY; STORE A SECOND SEQUENCE TO A MEMORY; READ FIRST CHARACTER OF FIRST SEQUENCE; READ FIRST CHARACTER OF SECOND SEQUENCE; YES (decision branch); DISPLAY HOMOLOGY LEVEL BETWEEN THE FIRST AND SECOND SEQUENCES; END. Reference numerals legible: 250, 252, 254, 256, 260, 262, 278.] 

Figure 4 
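
The character-by-character comparison in Figure 4 can be illustrated with a short Python sketch. The function name and the choice of denominator (the length of the first sequence) are assumptions made for the example; the legible parts of the figure do not say how the homology level is normalized.

```python
# Hypothetical sketch of the Figure 4 comparison; the normalization is an assumption.

def homology_level(first, second):
    """Percentage of positions in `first` whose character equals that of `second`."""
    if not first:
        raise ValueError("first sequence must not be empty")
    matches = 0
    # Read one character from each sequence at a time and count equal pairs.
    for a, b in zip(first, second):
        if a == b:
            matches += 1
    return 100.0 * matches / len(first)


# Two invented 10-nucleotide sequences differing at one position.
level = homology_level("atggatccca", "atggatcgca")
print(f"homology level: {level:.1f}%")
```

Positions of the first sequence that have no counterpart in a shorter second sequence count as mismatches under this normalization.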






[Figure 5 (sheet 5/5): flow chart. Boxes legible in the extraction, in order: START; STORE A FIRST SEQUENCE TO A MEMORY; OPEN A DATABASE OF SEQUENCE FEATURES; READ FIRST FEATURE FROM DATABASE; COMPARE FEATURE ATTRIBUTES WITH THE FIRST SEQUENCE; READ NEXT FEATURE IN DATABASE. Reference numerals legible: 300, 302, 304, 306, 308, 310.] 

Figure 5 
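
The feature scan in Figure 5 (read each feature from a database and compare its attributes with a stored sequence) might look like the following Python sketch. The feature table, the motif strings, the exact-substring matching rule, and the test sequence are all assumptions chosen for illustration, not content taken from the document.

```python
# Hypothetical sketch of the Figure 5 loop; features and matching rule are assumptions.

# Toy "database of sequence features": (feature name, motif) pairs.
FEATURES = [
    ("start codon", "atg"),
    ("EcoRI site", "gaattc"),
    ("polyadenylation signal", "aataaa"),
]


def find_features(sequence, features=FEATURES):
    """Return (name, 0-based position) for each feature motif found in sequence."""
    found = []
    for name, motif in features:  # read first feature, then the next one
        position = sequence.find(motif)  # first occurrence, or -1 if absent
        if position != -1:
            found.append((name, position))
    return found


found = find_features("ccatggatcccaataaagg")
for name, position in found:
    print(f"{name} at position {position}")
```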






<110> GENSET 

<120> Full-length human cDNAs encoding potentially secreted proteins 

<130> 78.W01 

<150> US 60/169,629 

<151> 1999-12-08 

<150> US 60/187,470 

<151> 2000-03-06 

<160> 482 

<170> Patent.pm 

<210> 1 

<211> 2201 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> 169..1692 

<220> 

<221> sig_peptide 
<222> 169..249 
<223> Von Heijne matrix 
score 7.15265901862021 
seq VLLLLLLERGMFS/SP 

<400> 1 

agatgtgaat agctccacta taccagcctc gtcttccttc cgggggacaa cgtgggtcag 60 
ggcacagaga gatatttaat gtcaccctct tggggctttc atgggactcc ctctgccaca 120 
ttttttggag gttgggaaag ttgctagagg cttcagaact ccagccta atg gat ccc 177 

Met Asp Pro 
-25 

aaa ctc ggg aga atg gct gcg tcc ctg ctg gct gtg ctg ctg ctg ctg 225 
Lys Leu Gly Arg Met Ala Ala Ser Leu Leu Ala Val Leu Leu Leu Leu 

-20 -15 -10 

ctg ctg gag cgc ggc atg ttc tcc tca ccc tcc ccg ccc ccg gcg ctg 273 
Leu Leu Glu Arg Gly Met Phe Ser Ser Pro Ser Pro Pro Pro Ala Leu 
-5 1 5 

tta gag aaa gtc ttc cag tac att gac ctc cat cag gat gaa ttt gtg 321 
Leu Glu Lys Val Phe Gln Tyr Ile Asp Leu His Gln Asp Glu Phe Val 

10 15 20 

cag acg ctg aag gag tgg gtg gcc atc gag agc gac tct gtc cag cct 369 
Gln Thr Leu Lys Glu Trp Val Ala Ile Glu Ser Asp Ser Val Gln Pro 
25 30 35 40 

gtg cct cgc ttc aga caa gag ctc ttc aga atg atg gcc gtg gct gcg 417 
Val Pro Arg Phe Arg Gln Glu Leu Phe Arg Met Met Ala Val Ala Ala 

45 50 55 

gac acg ctg cag cgc ctg ggg gcc cgt gtg gcc tcg gtg gac atg ggt 465 
Asp Thr Leu Gln Arg Leu Gly Ala Arg Val Ala Ser Val Asp Met Gly 
60 65 70 

cct cag cag ctg ccc gat ggt cag agt ctt cca ata cct ccc gtc atc 513 
Pro Gln Gln Leu Pro Asp Gly Gln Ser Leu Pro Ile Pro Pro Val Ile 
75 80 85 

ctg gcc gaa ctg ggg agc gat ccc acg aaa ggc acc gtg tgc ttc tac 561 
Leu Ala Glu Leu Gly Ser Asp Pro Thr Lys Gly Thr Val Cys Phe Tyr 

90 95 100 

ggc cac ttg gac gtg cag cct gct gac cgg ggc gat ggg tgg ctc acg 609 
Gly His Leu Asp Val Gln Pro Ala Asp Arg Gly Asp Gly Trp Leu Thr 

105 110 115 120 

gac ccc tat gtg ctg acg gag gta gac ggg aaa ctt tat gga cga gga 657 

Asp Pro Tyr Val Leu Thr Glu Val Asp Gly Lys Leu Tyr Gly Arg Gly 

5 125 130 135 

gcg acc gac aac aaa ggc cct gtc ttg get tgg ate aat get gtg age 705 

Ala Thr Asp Asn Lys Gly Pro Val Leu Ala Trp lie Asn Ala Val Ser 

140 145 150 

gee ttc aga gee ctg gag caa gat ctt cct gtg aat ate aaa ttc ate 753 

10 Ala Phe Arg Ala Leu Glu Gin Asp Leu Pro Val Asn' lie Lys Phe lie 

155 160 165 

att gag ggg atg gaa gag get ggc tct gtt gee ctg gag gaa ctt gtg 801 

lie Glu Gly Met Glu Glu Ala Gly Ser Val Ala Leu Glu Glu Leu Val 

170 175 180 

15 gaa aaa gaa aag gac cga ttc ttc tct ggt gtg gac tac att gta att 849 

Glu Lys Glu Lys Asp Arg Phe Phe Ser Gly Val Asp Tyr lie Val lie 

185 190 195 200 

tea gat aac ctg tgg ate age caa agg aag cca gca ate act tat gga 8 97 

Ser Asp Asn Leu Trp lie Ser Gin Arg Lys Pro Ala lie Thr Tyr Gly 

20 205 210 215 

acc egg ggg aac age tac ttc atg gtg gag gtg aaa tgc aga gac cag 945 

Thr Arg Gly Asn Ser Tyr Phe Met Val Glu Val Lys Cys Arg Asp Gin 

220 225 230 

gat ttt cac tea gga acc ttt ggt ggc ate ctt cat gaa cca atg get 993 

25 Asp Phe His Ser Gly Thr Phe Gly Gly lie Leu His Glu Pro Met Ala 

235 240 245 

gat ctg gtt get ctt etc ggt age ctg gta gac teg tct ggt cat ate 1041 

Asp Leu Val Ala Leu Leu Gly Ser Leu Val Asp Ser Ser Gly His lie 

250 255 260 

30 ctg gtc cct gga ate tat gat gaa gtg gtt cct ctt aca gaa gag gaa 1089 

Leu Val Pro Gly lie Tyr Asp Glu Val Val Pro Leu Thr Glu Glu Glu 

265 270 275 280 

ata aat aca tac aaa gee ate cat eta gac eta gaa gaa tac egg aat 1137 

lie Asn Thr Tyr Lys Ala lie His Leu Asp Leu Glu Glu Tyr Arg Asn 

35 285 290 295 

age age egg gtt gag aaa ttt ctg ttc gat act aag gag gag att eta 1185 

Ser Ser Arg Val Glu Lys Phe Leu Phe Asp Thr Lys Glu Glu lie Leu 

300 305 310 

atg cac etc tgg agg tac cca tct ctt tct att cat ggg ate gag ggc 1233 

40 Met His Leu Trp Arg Tyr Pro Ser Leu Ser lie His Gly lie Glu Gly 

315 320 325 

gcg ttt gat gag cct gga act aaa aca gtc ata cct ggc cga gtt ata 1281 

Ala Phe Asp Glu Pro Gly Thr Lys Thr Val lie Pro Gly Arg Val lie 

330 335 340 

45 gga aaa ttt tea ate cgt eta gtc cct cac atg aat gtg tct gcg gtg 1329 

Gly Lys Phe Ser lie Arg Leu Val Pro His Met Asn Val Ser Ala Val 

345 350 355 360 

gaa aaa cag gtg aca cga cat ctt gaa gat gtg ttc tec aaa aga aat 1377 

Glu Lys Gin Val Thr Arg His Leu Glu Asp Val Phe Ser Lys Arg Asn 

50 365 370 375 

agt tec aac aag atg gtt gtt tec atg act eta gga eta cac ccg tgg 1425 

Ser Ser Asn Lys Met Val Val Ser Met Thr Leu Gly Leu His Pro Trp 

380 385 390 

att gca aat att gat gac acc cag tat etc gca gca aaa aga gcg ate 1473 

55 lie Ala Asn lie Asp Asp Thr Gin Tyr Leu Ala Ala Lys Arg Ala lie 

395 400 405 

aga aca gtg ttt gga aca gaa cca gat atg ate egg gat gga tec acc 1521 

Arg Thr Val Phe Gly Thr Glu Pro Asp Met lie Arg Asp Gly Ser Thr 

410 415 420 

60 att cca att gec aaa atg ttc cag gag ate gtc cac aag age gtg gtg 1569 

lie Pro lie Ala Lys Met Phe Gin Glu lie Val His Lys Ser Val Val 

425 430 435 440 

eta att ccg ctg gga get gtt gat gat gga gaa cat teg cag aat gag 1617 
Leu Ile Pro Leu Gly Ala Val Asp Asp Gly Glu His Ser Gln Asn Glu 

445 450 455 

aaa atc aac agg tgg aac tac ata gag gga acc aaa tta ttt gct gcc 1665 
Lys Ile Asn Arg Trp Asn Tyr Ile Glu Gly Thr Lys Leu Phe Ala Ala 

460 465 470 

ttt ttc tta gag atg gcc cag ctc cat taatcacaag aaccttctag 1712 
Phe Phe Leu Glu Met Ala Gln Leu His 

475 480 
tctgatctga tccactgaca gattcacctc ccccacatcc ctagacaggg atggaatgta 1772 
aatatccaga gaatttgggt ctagtatagt acattttccc ttccatttaa aatgtcttgg 1832 
gatatctgga tcagtaataa aatatttcaa aggcacagat gttggaaatg gtttaaggtc 1892 
ccccactgca caccttcctc aagtcatagc tgcttgcagc aacttgattt ccccaagtcc 1952 
tgtgcaatag ccccaggatt ggattccttc caacctttta gcatatctcc aaccttgcaa 2012 
tttgattggc ataatcactc cagtttgctt tctaggtcct caagtgctcg tgacacataa 2072 
tcattccatc caatgatcgc ctttgcttta ccactctttc cttttatctt attaataaaa 2132 
atgttggtct ccaccactga aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaagaaaaaa 2192 
aaaaaaaaa 2201 
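
The feature coordinates given for SEQ ID NO: 1 (CDS 169..1692, sig_peptide 169..249) determine the lengths of the coding region, the signal peptide, and the mature protein. The Python sketch below works through that arithmetic; the variable and function names are hypothetical, and it assumes 1-based inclusive coordinates, which matches the position numbers printed beside the sequence.

```python
# Hypothetical helper names; coordinates are copied from the <222> lines of SEQ ID NO: 1.

def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


cds = (169, 1692)          # <221> CDS
sig_peptide = (169, 249)   # <221> sig_peptide

cds_nt = span_length(*cds)
assert cds_nt % 3 == 0, "a coding region should be a whole number of codons"

codons = cds_nt // 3
signal_aa = span_length(*sig_peptide) // 3
mature_aa = codons - signal_aa
mature_start_nt = sig_peptide[1] + 1  # first nucleotide after the signal peptide

print(f"CDS: {cds_nt} nt, {codons} codons")
print(f"signal peptide: {signal_aa} residues")
print(f"mature protein: {mature_aa} residues, starting at nucleotide {mature_start_nt}")
```

The result agrees with the residue numbering in the listing, which runs from -27 at the initiator Met to 481 at the final His.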



<210> 2 

<211> 1631 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 148..1140 

<220> 

<221> sig_peptide 
<222> 148..240 

<223> Von Heijne matrix 

score 10.0910253445132 
seq LVLLLVTRSPVNA/CL 






<400> 2 

gtctgctgcc gccattgtgc ggcgctggtc ccctcagagg gttcctgctg ctgccggtgc 
cttggaccct ccccctcgct tetegttcta ctgccccagg agcccggcgg gtcegggact 
cccgtccgtg ccggtgcggg cgccggc atg tgg ctg tgg gag gac cag ggc ggc 

Met Trp Leu Trp Glu Asp Gin Gly Gly 
-30 -25 
etc ctg ggc cct ttc tec ttc ctg ctg eta gtg ctg ctg ctg gtg acg 
Leu Leu Gly Pro Phe Ser Phe Leu Leu Leu Val Leu Leu Leu Val Thr 

-20 -15 -10 

egg age ccg gtc aat gee tgc etc etc acc ggc age etc ttc gtt eta 
Arg Ser Pro Val Asn Ala Cys Leu Leu Thr Gly Ser Leu Phe Val Leu 



-5 



10 



ctg cgc gtc ttc age ttt gag ccg gtg ccc tct tgc agg gec ctg cag 



Leu Arg Val Phe Ser Phe Glu Pro Val 
15 



Pro Ser Cys Arg Ala Leu Gin 
20 



25 

gtg etc aag ccc egg gac cgc att tct gec ate gec cac cgt ggc ggc 
Val Leu Lys Pro Arg Asp Arg lie Ser Ala lie Ala His Arg Gly Gly 



30 



35 



40 



age cac gac gcg ccc gag aac acg ctg gcg gec att egg cag gca get 
Ser His Asp Ala Pro Glu Asn Thr Leu Ala Ala lie Arg Gin Ala Ala 

45 50 55 

aag aat gga gca aca ggc gtg gag ttg gac att gag ttt act tct gac 
Lys Asn Gly Ala Thr Gly Val Glu Leu Asp lie Glu Phe Thr Ser Asp 
60 65 70 

999 att cct 9 tc tta at 9 cac 9 at aac aca 9 ta 9 at a 99 ac 9 act 9 at 

Gly lie Pro Val Leu Met His Asp Asn Thr Val Asp Arg Thr Thr Asp 
75 80 85 90 

ggg a ct ggg cga ttg tgt gat ttg aca ttt gaa caa att agg aag ctg 

Gly Thr Gly Arg Leu Cys Asp Leu Thr Phe Glu Gin lie Arg Lys Leu 



60 
120 
174 



222 



270 



318 



366 



414 



462 



510 



558 
95 100 105 

aat cct gca gca aac cac aga etc agg aat gat ttc cct gat gaa aag 

Asn Pro Ala Ala Asn His Arg Leu Arg Asn Asp Phe Pro Asp Glu Lys 
110 115 120 

5 ate cct acc eta atg gaa get gtt gca gag tgc eta aac cat aac etc 

lie Pro Thr Leu Met Glu Ala Val Ala Glu Cys Leu Asn His Asn Leu 

125 130 135 

aca ate ttc ttt gat gtc aaa ggc cat gca cac aag get act gag get 

Thr lie Phe Phe Asp Val Lys Gly His Ala His Lys Ala Thr Glu Ala 

10 140 145 150 

eta aag aaa atg tat atg gaa ttt cct caa ctg tat aat aat agt gtg 

Leu Lys Lys Met Tyr Met Glu Phe Pro Gin Leu Tyr Asn Asn Ser Val 
155 160 165 170 

gtc tgt tct ttc ttg cca gaa gtt ate tac aag atg aga caa aca gat 

15 Val Cys Ser Phe Leu Pro Glu Val lie Tyr Lys Met Arg Gin Thr Asp 

175 180 185 

c 99 9 at 9 ta ata aca 9 ca tta act cac a 9 a cct tgg a 9 c cta a 9 c cat 

Arg Asp Val lie Thr Ala Leu Thr His Arg Pro Trp Ser Leu Ser His 
190 195 200 

20 aca gga gat ggg aaa cca cgc tat gat act ttc tgg aaa cat ttt ata 

Thr Gly Asp Gly Lys Pro Arg Tyr Asp Thr Phe Trp Lys His Phe lie 

205 210 215 

ttt gtt atg atg gac att ttg etc gat tgg age atg cat aat ate ttg 

Phe Val Met Met Asp lie Leu Leu Asp Trp Ser Met His Asn lie Leu 

25 220 225 230 

tgg tac ct 9 t9t 99 a at t tea get ttc etc atg caa aag gat ttt gta 

Trp Tyr Leu Cys Gly lie Ser Ala Phe Leu Met Gin Lys Asp Phe Val 
235 240 245 250 

tec ccg gee tac ttg aag aag tgg tea get aaa gga ate cag gtt gtt 

30 Ser Pro Ala Tyr Leu Lys Lys Trp Ser Ala Lys Gly lie Gin Val Val 

255 260 265 

99t tgg act gtt aat acc ttt gat gaa aag agt tac tac gaa tec cat 

Gly Trp Thr Val Asn Thr Phe Asp Glu Lys Ser Tyr Tyr Glu Ser His 
270 275 280 

35 ctt ggt tec age tat ate act gac age atg gta gaa gac tgc gaa cct 

Leu Gly Ser Ser Tyr lie Thr Asp Ser Met Val Glu Asp Cys Glu Pro 

285 290 295 

cac ttc tagactttca cggtgggacg aaacgggttc agaaactgee aggggectea 
His Phe 
40 300 

tacagggata tcaaaatacc ctttgtgcta gcccaggccc tggggaatca ggtgactcac 

acaaatgeaa tagttggtca ctgcattttt acctgaacca aagctaaacc cggtgttgcc 

accatgcacc atggcatgcc agagttcaac actgttgctc ttgaaaatct ggggtctgaa 

aaaaegcaca agagcccctg ccctgcccta gctgaggcac acagggagac ccagtgagga 

45 taagcacaga ttgaattgta caatttgeag atgcagatgt aaatgcatgg gaeatgeatg 

ataactcaga gttgacattt taaaacttgc cacacttatt tcaaatattt gtactcagct 

atgttaacat gtactgtaga catcaaactt gtggccatac taataaaatt attaaaagga 
gcacaaaaaa aaaaaaaaaa a 



606 



654 



702 



750 



798 



846 



894 



942 



990 



1038 



1086 



1134 



1190 



1250 
1310 
1370 
1430 
1490 
1550 
1610 
1631 



<210> 3 

<211> 1245 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 
<222> 85..906 

<220> 

<221> sig_peptide 
<222> 85..135 
<223> Von Heijne matrix 

score 3.86022363031904 
seq GFVAALVAGGVAG/VS 
<400> 3 

aaaacatggc ggcgcccagc gcgcgaggac gtgatccgct tctgctccgg cttggattgt 60 
5 agccttgacg aggtctgagc gacc atg gac egg ccg ggg ttc gtg gca gcg 111 

Met Asp Arg Pro Gly Phe Val Ala Ala 
-15 -10 
ctg gtg get ggt ggg gta gca ggt gtt tct gtt gac ttg ata tta ttt 159 
Leu Val Ala Gly Gly Val Ala Gly Val Ser Val Asp Leu lie Leu Phe 
10-5 15 

cct ctg gat acc att aaa acc agg ctg cag agt ccc caa gga ttt agt 207 
Pro Leu Asp Thr lie Lys Thr Arg Leu Gin Ser Pro Gin Gly Phe Ser 

10 15 20 

aag get ggt ggt ttt cat gga ata tat get ggc gtt cct tct get get 255 
Lys Ala Gly Gly Phe His Gly Ile Tyr Ala Gly Val Pro Ser Ala Ala
25 30 35 40

att gga tcc ttt cct aat gct gct gca ttt ttt atc acc tat gaa tat 303
Ile Gly Ser Phe Pro Asn Ala Ala Ala Phe Phe Ile Thr Tyr Glu Tyr
45 50 55

gtg aag tgg ttt ttg cat gct gat tca tct tca tat ttg aca cct atg 351
Val Lys Trp Phe Leu His Ala Asp Ser Ser Ser Tyr Leu Thr Pro Met
60 65 70

aaa cat atg ttg gct gcc tct gct gga gaa gtg gtt gcc tgc ctg att 399
Lys His Met Leu Ala Ala Ser Ala Gly Glu Val Val Ala Cys Leu Ile
75 80 85

cga gtt cca tct gaa gtg gtt aag cag agg gca cag gta tct gct tct 447
Arg Val Pro Ser Glu Val Val Lys Gln Arg Ala Gln Val Ser Ala Ser
90 95 100

aca aga aca ttt cag att ttc tct aac atc tta tat gaa gag ggt atc 495
Thr Arg Thr Phe Gln Ile Phe Ser Asn Ile Leu Tyr Glu Glu Gly Ile
105 110 115 120

caa ggg ttg tat cga ggc tat aaa agc aca gtt tta aga gag att cct 543
Gln Gly Leu Tyr Arg Gly Tyr Lys Ser Thr Val Leu Arg Glu Ile Pro
125 130 135

ttt tct ttg gtc cag ttt ccc tta tgg gag tcc tta aaa gcc ctc tgg 591
Phe Ser Leu Val Gln Phe Pro Leu Trp Glu Ser Leu Lys Ala Leu Trp
140 145 150

tcc tgg agg cag gat cat gtg gtg gat tct tgg cag tca gca gtc tgt 639
Ser Trp Arg Gln Asp His Val Val Asp Ser Trp Gln Ser Ala Val Cys
155 160 165

gga gct ttt gca ggt gga ttt gcc gct gca gtc acc acc cct cta gac 687
Gly Ala Phe Ala Gly Gly Phe Ala Ala Ala Val Thr Thr Pro Leu Asp
170 175 180

gtg gca aag aca aga att atg ctg gca aag gct ggc tcc agc act gct 735
Val Ala Lys Thr Arg Ile Met Leu Ala Lys Ala Gly Ser Ser Thr Ala
185 190 195 200

gat ggg aat gtg ctc tct gtc ctg cat ggg gtc tgg cgg tca cag ggg 783
Asp Gly Asn Val Leu Ser Val Leu His Gly Val Trp Arg Ser Gln Gly
205 210 215

ctg gca gga tta ttt gca ggt gtc ttc cct cga atg gca gcc atc agt 831
Leu Ala Gly Leu Phe Ala Gly Val Phe Pro Arg Met Ala Ala Ile Ser
220 225 230

ctg gga ggt ttc atc ttt ctg ggg gct tat gac cga acg cac agc ttg 879
Leu Gly Gly Phe Ile Phe Leu Gly Ala Tyr Asp Arg Thr His Ser Leu
235 240 245

ctg ttg gaa gtt ggc aga aag agt cct tgaagcagag acaagcctca 926
Leu Leu Glu Val Gly Arg Lys Ser Pro
250 255

cctccacttc tgtcaagaga ggggcctgca gtgcaaaccc tcttccgctg agcagctgtc 986
tgaactatag gccccagtgc tgaagaccag ttgtgctaag ataccggcat ggagattgtg 1046
ccatccgtgg tataggctgg ctggtatgaa gtcattggcc tgtatgccag agagctaaga 1106
gaagaaaacg gggtctgtgg cggtactctg aacaatttcc tcagaacctc ttaataaata 1166
agtttggtaa tgctgagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1226
agaaaaaaaa aaaaaaaaa 1245

<210> 4
<211> 1623
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 31..1248

<220>
<221> sig_peptide
<222> 31..135
<223> Von Heijne matrix
score 6.3770152988307
seq TLLLFAAPFGLLG/EK

<400> 4

aacctcttcc gtcggctgaa ttgcggccgt atg cgc ggc tct gtg gag tgc acc 54
Met Arg Gly Ser Val Glu Cys Thr
-35 -30

tgg ggt tgg ggg cac tgt gcc ccc agc ccc ctg ctc ctt tgg act cta 102
Trp Gly Trp Gly His Cys Ala Pro Ser Pro Leu Leu Leu Trp Thr Leu
-25 -20 -15

ctt ctg ttt gca gcc cca ttt ggc ctg ctg ggg gag aag acc cgc cag 150
Leu Leu Phe Ala Ala Pro Phe Gly Leu Leu Gly Glu Lys Thr Arg Gln
-10 -5 1 5

gtg tct ctg gag gtc atc cct aac tgg ctg ggc ccc ctg cag aac ctg 198
Val Ser Leu Glu Val Ile Pro Asn Trp Leu Gly Pro Leu Gln Asn Leu
10 15 20

ctt cat ata cgg gca gtg ggc acc aat tcc aca ctg cac tat gtg tgg 246
Leu His Ile Arg Ala Val Gly Thr Asn Ser Thr Leu His Tyr Val Trp
25 30 35

agc agc ctg ggg cct ctg gca gtg gta atg gtg gcc acc aac acc ccc 294
Ser Ser Leu Gly Pro Leu Ala Val Val Met Val Ala Thr Asn Thr Pro
40 45 50

cac agc acc ctg agc gtc aac tgg agc ctc ctg cta tcc cct gag ccc 342
His Ser Thr Leu Ser Val Asn Trp Ser Leu Leu Leu Ser Pro Glu Pro
55 60 65

gat ggg ggc ctg atg gtg ctc cct aag gac agc att cag ttt tct tct 390
Asp Gly Gly Leu Met Val Leu Pro Lys Asp Ser Ile Gln Phe Ser Ser
70 75 80 85

gcc ctt gtt ttt acc agg ctg ctt gag ttt gac agc acc aac gtg tcc 438
Ala Leu Val Phe Thr Arg Leu Leu Glu Phe Asp Ser Thr Asn Val Ser
90 95 100

gat acg gca gca aag cct ttg gga aga cca tat cct cca tac tcc ttg 486
Asp Thr Ala Ala Lys Pro Leu Gly Arg Pro Tyr Pro Pro Tyr Ser Leu
105 110 115

gcc gat ttc tct tgg aac aac atc act gat tca ttg gat cct gcc acc 534
Ala Asp Phe Ser Trp Asn Asn Ile Thr Asp Ser Leu Asp Pro Ala Thr
120 125 130

ctg agt gcc aca ttt caa ggc cac ccc atg aac gac cct acc agg act 582
Leu Ser Ala Thr Phe Gln Gly His Pro Met Asn Asp Pro Thr Arg Thr
135 140 145

ttt gcc aat ggc agc ctg gcc ttc agg gtc cag gcc ttt tcc agg tcc 630
Phe Ala Asn Gly Ser Leu Ala Phe Arg Val Gln Ala Phe Ser Arg Ser
150 155 160 165

agc cga cca gcc caa ccc cct cgc ctc ctg cac aca gca gac acc tgt 678
Ser Arg Pro Ala Gln Pro Pro Arg Leu Leu His Thr Ala Asp Thr Cys
170 175 180

cag cta gag gtg gcc ctg att gga gcc tct ccc cgg gga aac cgt tcc 726
Gln Leu Glu Val Ala Leu Ile Gly Ala Ser Pro Arg Gly Asn Arg Ser
185 190 195

ctg ttt ggg ctg gag gta gcc aca ttg ggc cag ggc cct gac tgc ccc 774
Leu Phe Gly Leu Glu Val Ala Thr Leu Gly Gln Gly Pro Asp Cys Pro
200 205 210

tca atg cag gag cag cac tcc atc gac gat gaa tat gca ccg gcc gtc 822
Ser Met Gln Glu Gln His Ser Ile Asp Asp Glu Tyr Ala Pro Ala Val
215 220 225

ttc cag ttg gac cag cta ctg tgg ggc tcc ctc cca tca ggc ttt gca 870
Phe Gln Leu Asp Gln Leu Leu Trp Gly Ser Leu Pro Ser Gly Phe Ala
230 235 240 245

cag tgg cga cca gtg gct tac tcc cag aag ccg ggg ggc cga gaa tca 918
Gln Trp Arg Pro Val Ala Tyr Ser Gln Lys Pro Gly Gly Arg Glu Ser
250 255 260

gcc ctg ccc tgc caa gct tcc cct ctt cat cct gcc tta gca tac tct 966
Ala Leu Pro Cys Gln Ala Ser Pro Leu His Pro Ala Leu Ala Tyr Ser
265 270 275

ctt ccc cag tca ccc att gtc cga gcc ttc ttt ggg tcc cag aat aac 1014
Leu Pro Gln Ser Pro Ile Val Arg Ala Phe Phe Gly Ser Gln Asn Asn
280 285 290

ttc tgt gcc ttc aat ctg acg ttc ggg gct tcc aca ggc cct ggc tat 1062
Phe Cys Ala Phe Asn Leu Thr Phe Gly Ala Ser Thr Gly Pro Gly Tyr
295 300 305

tgg gac caa cac tac ctc agc tgg tcg atg ctc ctg ggt gtg ggc ttc 1110
Trp Asp Gln His Tyr Leu Ser Trp Ser Met Leu Leu Gly Val Gly Phe
310 315 320 325

cct cca gtg gac ggc ttg tcc cca cta gtc ctg ggc atc atg gca gtg 1158
Pro Pro Val Asp Gly Leu Ser Pro Leu Val Leu Gly Ile Met Ala Val
330 335 340

gcc ctg ggt gcc cca ggg ctc atg ctg cta ggg ggc ggc ttg gtt ctg 1206
Ala Leu Gly Ala Pro Gly Leu Met Leu Leu Gly Gly Gly Leu Val Leu
345 350 355

ctg ctg cac cac aag aag tac tca gag tac cag tcc ata aat 1248
Leu Leu His His Lys Lys Tyr Ser Glu Tyr Gln Ser Ile Asn
360 365 370

taaggcccgc tctctggagg gaaggacatt actgaacctg tcttgctgtg cctcgaaact 1308
ctggaggttg gagcatcaag ttccagcccc cttcactccc ccatcttgct tttctgtgga 1368
acctcagagg ccagcctcga cttcctggag acccccaggt ggggcttcct tcatactttg 1428
ttgggggact ttggaggcgg gcaggggaca gggctattga taaggtcccc ttggtgttgc 1488
cttcttgcat ctccacacat ttcccttgga tgggacttgc aggcctaaat gagaggcatt 1548
ctgactggtt ggctgccctg gaaggcaaga aaatagattt attttttttt cacagggcaa 1608
aaaaaaaaaa aaaaa 1623



<210> 5
<211> 1454
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 72..143

<220>
<221> sig_peptide
<222> 72..119
<223> Von Heijne matrix
score 5.68931280801877
seq LGMLLGLLMAACT/PS



<400> 5 

gtgtctgcca ctcggctgcc ggaggccgaa ggtccctgac tatggctccc cagagcctgc 60
cttcatctag g atg gct cct ctg ggc atg ctg ctt ggg ctg ctg atg gcc 110
Met Ala Pro Leu Gly Met Leu Leu Gly Leu Leu Met Ala
-15 -10 -5

gcc tgc aca cct tct gcc tca gtc atc aga acc tgaaggagtt tgccctgacc 163
Ala Cys Thr Pro Ser Ala Ser Val Ile Arg Thr
1 5

aacccagaga agagcagcac caaagaaacg gagagaaaag aaaccaaagc cgaggaggag 223
ctggatgccg aagtcctgga ggtgttccac ccgacgcatg agtggcaggc ccttcagcca 283
gggcaggctg tccctgcagg atcccacgta cggctgaatc ttcagactgg ggaaagagag 343
gcaaaactcc aatatgagga caagttccga aataatttga aaggcaaaag gctggatatc 403
aacaccaaca cctacacatc tcaggatctc aagagtgcac tggcaaaatt caaggagggg 463
gcagagatgg agagttcaaa ggaagacaag gcaaggcagg ctgaggtaaa gcggctcttc 523
cgccccattg aggaactgaa gaaagacttt gatgagctga atgttgtcat tgagactgac 583
atgcagatca tggtacggct gatcaacaag ttcaatagtt ccagctccag tttggaagag 643
aagattgctg cgctctttga tcttgaatat tatgtccatc agatggacaa tgcgcaggac 703
ctgctttcct ttggtggtct tcaagtggtg atcaatgggc tgaacagcac agagcccctc 763
gtgaaggagt atgctgcgtt tgtgctgggc gctgcctttt ccagcaaccc caaggtccag 823
gtggaggcca tcgaaggggg agccctgcag aagctgctgg tcatcctggc cacggagcag 883
ccgctcactg caaagggagg tgctcaccgt gcgcgtggtc acactgctct acgacctggt 943
cacggagaag atgttcgccg aggaggaggc tgagctgacc caggagatgt ccccagagaa 1003
gctgcagcag tatcgccagg tacacctcct gccakgcctg tgggaacagg gctggtgcga 1063
gatcacggcc cacctcctgg cgctgcccga gcatgatgcc ygtgagaagg tgctgcwgac 1123
actgggcgtc ctcctgacca cctgccggga ccgctaccgt caggaccccc agctcggcag 1183
gacactggcc agcctgcagg ctgagtacca ggtgctggcc agcctggagc tgcaggatgg 1243
tgaggacgag ggctacttcc aggagctgct gggctctgtc aacagcttgc tgaaggagct 1303
gagatgaggc cccacaccag gactggactg ggatgccgct agtgaggctg aggggtgcca 1363
gcgtgggtgg gcttctcagg caggaggaca tcttggcagt gctggcttgg ccattaaatg 1423
gaaacctgaa ggccaaaaaa aaaaaaaaaa a 1454



<210> 6
<211> 1639
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 111..1154

<220>
<221> sig_peptide
<222> 111..197
<223> Von Heijne matrix
score 4.68065944212013
seq LLGPLMAACFTFC/LS



<400> 6 

agaeggtege cgccgcgttt gcgcaggggg agetggtege 
9t999 a 9ttg tgtctgccac tcggctgccg gaggecgaag 





ccc 


cag 


age 




Pro 


Gin 


Ser 








-25 


50 


999 


ccg 


ctg 




Gly 


Pro 


Leu 






-10 






ctg 


aag 


gag 




Leu 


Lys 


Glu 


55 










aca 


gag 


aga 




Thr 


Glu 


Arg 




ctg 


gag 


gtg 


60 


Leu 


Glu 


Val 








40 




cag 


get 


gtc 




Gin 


Ala 


Val 



ctg 
Leu 

atg 
Met 

ttt 
Phe 

aaa 

Lys 

25 

ttc 

Phe 

cct 
Pro 



cct 
Pro 

gcc 
Ala 

gcc 

Ala 

10 

gaa 

Glu 

cac 
His 

gca 
Ala 



tea 
Ser 

gcc 
Ala 

ctg 
Leu 

acc 
Thr 

ccg 
Pro 

gga 
Gly 



tct 
Ser 

tgc 
Cys 
-5 
acc 
Thr 



agg atg get cct 
Arg Met Ala Pro 
-20 

ttc acc ttc tgc 
Phe Thr Phe Cys 



aac cca gag aag 
Asn Pro Glu Lys 
15 

aaa gcc gag gag gag 
Lys Ala Glu Glu Glu 
30 

acg cat gag tgg cag 
Thr His Glu Trp Gin 
45 

tec cac gta egg ctg 
Ser His Val Arg Leu 



cgccgcggcc gectggaatt 
gtccctgact atg get 
Met Ala 
ctg ggc atg ctg ctt 
Leu Gly Met Leu Leu 
-15 

etc agt cat cag aac 
Leu Ser His Gin Asn 
1 5 
age age acc aaa gaa 
Ser Ser Thr Lys Glu 
20 

ctg gat gcc gaa gtc 
Leu Asp Ala Glu Val 
35 

gcc ctt cag cca ggg 
Ala Leu Gin Pro Gly 
50 

aat ctt cag act ggg 
Asn Leu Gin Thr Gly 



60 
116 

164 



212 



260 



308 



356 



404 
55



60 



65 






gaa 


aga 


aaa 

33 33 


qca 


aaa 


etc 


caa 


tat 


aaa 

33 ^*33 


aac 

ZJ 


aag 


ttc 


caa 


aat 


aat 


tta 

«- 33 


Glu 


Arg 


Glu 


Ala 


Lys 


Leu 


Gin 


Tvr 


Glu 


Asp 


Lys 


Phe 


Arg 


Asn 


Asn 


Leu 


70 










75 










80 










85 


aaa 


qqc 


aaa 


aaa 


eta 

33 


aat 


ate 


aac 


acc 


aac 


acc 


tac 


aca 


tct 


caa 

ZJ 


aat 

33 *-* *- 


Lvs 


Glv 


Lys 


Ara 


Leu 


Asp 


lie 


Asn 


Thr 


Asn 


Thr 


Tvr 


Thr 


Ser 


Gin 


Asp 










90 










95 










100 




etc 


aag 


aat 

33 *- 


qca 


eta 


qca 


aaa 


ttc 


aag 


aaa 

33 *-*33 


aaa 

33 33 33 


aca 


aaa 

ZJ *-*33 


ata 

ZJ 


aaa 


aat 

33 *- 


Leu 


Lys 


Ser 


Ala 


Leu 


Ala 


Lys 


Phe 


Lys 


Glu 


Glv 


Ala 


Glu 


Met 


Glu 


Ser 








105 










110 










115 






tea 


aag 


gaa 


gac 


aag 


gca 


aaa 
«yy 


cag 


act 

y wL " 


aaa 


gta 


aag 


caa 
^yy 


etc 


ttc 


cgc 


Ser 


Lys 


Glu 


Asp 


Lys 


Ala 


Arg 


Gin 


Ala 


Glu 


Val 


Lys 


Arg 


Leu 


Phe 


Ara 
^.j.y 






12 0 










125 










13 0 








ccc 


att 


aaa 


gaa 


ctg 


aag 


aaa 


gac 


ttt 


gat 


aaa 

y «=*y 


eta 

*- ZJ 


aat 


gtt 


gtc 


att 


Pro 


lie 


Glu 


Glu 


Leu 


Lys 


Lys 


Asp 


Phe 


Asp 


Glu 


Leu 


Asn 


Val 


Val 


lie 




135 










140 










145 










aaa 

33 a ZJ 


act 


gac 


atg 


cag 


ate 


atg 


gta 


caa 

^ 33 33 


ctg 


ate 


aac 


aag 


ttc 


aat 


agt 


Glu 


Thr 


Asp 


Met 


Gin 


lie 


Met 


Val 


Arg 


Leu 


lie 


Asn 


Lys 


Phe 


Asn 


Ser 


15 0 










155 










160 










165 


tec 


age 


tec 


agt 


ttg 


gaa 


aaa 
y a y 


aag 


att 


get 


aca 
y^y 


etc 


ttt 


gat 


ctt 


gaa 


Ser 


Set* 


Ser 


S er 


Leu 


Glu 


Glu 


Lys 


lie 


Ala 


Ala 


Leu 


Phe 


Asp 


Leu 


Glu 










170 










175 










180 




tat 


tat 


gtc 


cat 


cag 


atg 


gac 


aat 


aca 
y^y 


cag 


gac 


ctg 


ctt 


tec 


ttt 


aat 
yy 






Val 


His 


Gin 


Met 


Asp 


Asn 


Ala 


Gin 


Asp 


Leu 


Leu 


Ser 


Phe 


Gly 








185 










190 










195 






aat 


Ctt 


caa 


ata 

y *-y 


ata 
y *-y 


ate 


aat 


aaa 
yyy 


ctg 


aac 


age 


aca 


aaa 
y «y 


CCC 


etc 


ata 
y l -y 


Gly 


Leu 


Gin 


Val 


Val 


lie 


Asn 


Gly 


Leu 


Asn 


Ser 


Thr 


Glu 


Pro 


Leu 


Val 






2 0 0 










2 05 










210 








aag 


aaa 
y c*y 


tat 


get 


acid 
y *-y 


ttt 


ata 
y *-y 


ctg 


aac 
yy^ 


get 


gc c 


ttt 


tec 


age 


aac 


ccc 


Lys 


Glu 


Tvr 


Ala 


Ala 


Phe 


Val 


Leu 


Glv 


Ala 


Ala 


Phe 


Ser 


Ser 


Asn 


Pro 




215 










22 0 










22 5 










aaa, 


gt c 


cag 


ata 
y *-y 


aaa 
y <-*y 


gee 


ate 


gaa 


aaa 
yyy 


aaa 

y y «. 


gc c 


c tg 


cag 


aag 


ctg 


ctg 


Ly s 


Val 


Gin 


Val 


Glu 


Ala 


lie 


Glu 


Gly 


Gly 


Ala 


Leu 


Gin 


Lys 


Leu 


Leu 


2 3 0 










2 35 










24 0 










245 


gt C 


ate 


ctg 


gee 


acg 


aaa 
y a y 


cag 


ccg 


etc 


act 


gca 


aag 


aag 


aag 


gtc 


c tg 


Val 


lie 


Leu 


Ala 


Thr 


Glu 


Gin 


Pro 


Leu 


Thr 


Ala 


Lys 


Lys 


Lys 


Val 


Leu 










2 50 










2 55 










2 60 




ttt 


gca 


ctg 


tgc 


tec 


ctg 


ctg 


cgc 


cac 


ttc 


ccc 


tat 


gec 


cag 


caa 

^33 33 


cag 


Phe 


Ala 


Leu 


Cvs 


Ser 


Leu 


Leu 


Arg 


His 


Phe 


Pro 


Tvr 


Ala 


Gin 


Arg 


Gin 








2 65 










2 70 










2 75 






ttc 


ctg 


aag 


etc 


aaa 

Z3z3z3 


aaa 
yyy 


ctg 


cag 


gtc 


ctg 


agg 


acc 


ctg 


gtg 


cag 


gag 


Phe 


Leu 


Lvs 


Leu 


Gly 


Gly 


Leu 


Gin 


Val 


Leu 


Arg 


Thr 


Leu 


Val 


Gin 


Glu 






280 










285 










290 








aag 


ggc 


acg 


gag 


gtg 


etc 


gee 


gtg 


cgc 


gtg 


gtc 


aca 


ctg 


etc 


tac 


gac 


Lys 


Gly 


Thr 


Glu 


Val 


Leu 


Ala 


Val 


Arg 


Val 


Val 


Thr 


Leu 


Leu 


Tyr 


Asp 




295 










300 










305 










ctg 


gtc 


acg 


gag 


aag 


atg 


ttc 


gec 


gag 


gag 


taggctgagc tgacccagga 


Leu 


Val 


Thr 


Glu 


Lys 


Met 


Phe 


Ala 


Glu 


Glu 















310 



315 



gatgtcccca gagaagctgc agcagtatcg ccaggtacac
acagggctgg tgcgagatca cggcccacct cctggcgctg
gaaggtgctg cagacactgg gcgtcctcct gaccacctgc
cccccagctc ggcaggacac tggccagcct gcaggctgag
ggagctgcag gatggtgagg acgagggcta cttccaggag
cttgctgaag gagctgagat gaggccccac accaggactg
ggctgagggg tgccagcgtg ggtgggcttc tcaggcagga
cttggccatt aaatggaaac ctgaaggcaa aaaaaaaaaa



ctcctgccag gcctgtggga
cccgagcatg atgcccgtga
cgggaccgct accgtcagga
taccaggtgc tggccagcct
ctgctgggct ctgtcaacag
gactgggatg ccgctagtga
ggacatcttg gcagtgctgg
aaaaa



452 



500 



548 



596 



644 



692 



740 



788 



836 



884 



932 



980 



1028 



1076 



1124 



1174 



1234 
1294 
1354 
1414 
1474 
1534 
1594 
1639 



<210> 7
<211> 1768
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 66..1256

<220>
<221> sig_peptide
<222> 66..173
<223> Von Heijne matrix
score 4.89555877630516
seq LLLLRLNDAALRA/LQ

<400> 7 

agaggaggtg gcggtggtgg ccctcgcctg tggcccccgt gctgcttgca ctcgaactcg 60
tcgcc atg gag gag ctc cag gag cct ctg aga gga cag ctc cgg ctc tgc 110
Met Glu Glu Leu Gln Glu Pro Leu Arg Gly Gln Leu Arg Leu Cys
-35 -30 -25

ttc acg caa gct gcc cgg act agc ctc tta ctg ctc agg ctc aac gac 158
Phe Thr Gln Ala Ala Arg Thr Ser Leu Leu Leu Leu Arg Leu Asn Asp
-20 -15 -10

gct gcc ctg cgg gcg ctg caa gag tgt cag cgg caa cag gta cgg ccg 206
Ala Ala Leu Arg Ala Leu Gln Glu Cys Gln Arg Gln Gln Val Arg Pro
-5 1 5 10

gtg att gct ttc caa ggc cac cga ggg tat ctg aga ctc cca ggc cct 254
Val Ile Ala Phe Gln Gly His Arg Gly Tyr Leu Arg Leu Pro Gly Pro
15 20 25

ggt tgg tcc tgc ctc ttc tcc ttc ata gtg tcc cag tgt tgt cag gag 302
Gly Trp Ser Cys Leu Phe Ser Phe Ile Val Ser Gln Cys Cys Gln Glu
30 35 40

ggc gct ggt ggt agc ttg gac ctt gtg tgc caa cgc ttc ctc agg tct 350
Gly Ala Gly Gly Ser Leu Asp Leu Val Cys Gln Arg Phe Leu Arg Ser
45 50 55

ggg cct aac agc ctc cac tgc ctg ggc tca ctc agg gag cgc ctc att 398
Gly Pro Asn Ser Leu His Cys Leu Gly Ser Leu Arg Glu Arg Leu Ile
60 65 70 75

att tgg gca gcc atg gat tct atc cca gcc cca tca tca gtt cag gga 446
Ile Trp Ala Ala Met Asp Ser Ile Pro Ala Pro Ser Ser Val Gln Gly
80 85 90

cac aac ctg act gaa gat gcc aga cat cct gag agt tgg cag aac aca 494
His Asn Leu Thr Glu Asp Ala Arg His Pro Glu Ser Trp Gln Asn Thr
95 100 105

gga ggc tat tct gaa gga gat gca gta tca cag cca cag atg gca cta 542
Gly Gly Tyr Ser Glu Gly Asp Ala Val Ser Gln Pro Gln Met Ala Leu
110 115 120

gag gag gtg tca gtg tca gat cca ctg gca agc aac caa gga cag tca 590
Glu Glu Val Ser Val Ser Asp Pro Leu Ala Ser Asn Gln Gly Gln Ser
125 130 135

ctc cca gga tcc tca agg gag cac atg gca cag tgg gaa gtg aga agc 638
Leu Pro Gly Ser Ser Arg Glu His Met Ala Gln Trp Glu Val Arg Ser
140 145 150 155

cag acc cat gtt cca aac aga gaa cct gtt cag gca ctg cct tcc tct 686
Gln Thr His Val Pro Asn Arg Glu Pro Val Gln Ala Leu Pro Ser Ser
160 165 170

gcc agc cgg aaa cgt ctg gac aag aaa cgt tca gtg cct gta gcc act 734
Ala Ser Arg Lys Arg Leu Asp Lys Lys Arg Ser Val Pro Val Ala Thr
175 180 185

gta gaa ctg gaa gaa aag agg ttc aga act ctg cct tta gtg cca agc 782
Val Glu Leu Glu Glu Lys Arg Phe Arg Thr Leu Pro Leu Val Pro Ser
190 195 200

ccc cta caa ggc ctg acc aat cag gat tta caa gag gga gaa gat tgg 830
Pro Leu Gln Gly Leu Thr Asn Gln Asp Leu Gln Glu Gly Glu Asp Trp
205 210 215

gag caa gaa gat gag gac atg gac ccc aga tta gaa cac agt tcc tca 878
Glu Gln Glu Asp Glu Asp Met Asp Pro Arg Leu Glu His Ser Ser Ser

220 225 230 

gtt caa gaa gat tct gaa tec cca agt cct gaa 
Val Gin Glu Asp Ser Glu Ser Pro Ser Pro Glu 

240 245 
etc ctg caa tac agg gec ate cac agt gca gaa 
Leu Leu Gin Tyr Arg Ala lie His Ser Ala Glu 

255 260 
gag cag gac ttt gag aca gat tat get gaa tac 
Glu Gin Asp Phe Glu Thr Asp Tyr Ala Glu Tyr 

270 275 
cgt gtt ggg act gca age caa agg ttc ata gag 
Arg Val Gly Thr Ala Ser Gin Arg Phe lie Glu 

285 290 
aaa aga gtt egg cga gga act cca gaa tac aag 
Lys Arg Val Arg Arg Gly Thr Pro Glu Tyr Lys 
300 305 310 

ata ate cag gaa tat aaa aag ttc agg aag cag 
lie lie Gin Glu Tyr Lys Lys Phe Arg Lys Gin 

320 325 
gaa gaa aag cgt cgc tgt gag tac ctt cac cag 
Glu Glu Lys Arg Arg Cys Glu Tyr Leu His Gin 

335 340 
aaa ggt etc ate ctg gag ttt gag gaa aag aac 
Lys Gly Leu lie Leu Glu Phe Glu Glu Lys Asn 

350 355 
tgaagttatc aagggaattt ttgagectet gcttagtgaa 
tataaactaa atagaatgea actatctget tttcttatgc 
ggcaagtaga gagctgetet aggttcttga ggtttggttt 
atgggcactg tgcaaagact ccatagctgt gectaggagt 
ttggcttttt tacctttagt teagecaagt cattttcaag 
ttcaggataa aataatgagg acattagaca aaccaaacta 
cctctctaag gaaacagtaa taataacttc tgataagagt 
ctggatataa tgggaaaggg cctgggtgtt acccatgtac 
catggctaaa aaattaaaaa aaaaaaaaaa aa 

<210> 8 

<211> 1510 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 190..1398



gat ata cca 
Asp lie Pro 



cag 
Gin 

cgc 
Arg 

ctg 
Leu 
295 

gtc 
Val 



caa 
Gin 

ate 
He 
280 
gga 
Gly 



cat 
His 
265 
ctg 
Leu 

gca 
Ala 



ctg gaa 
Leu Glu 



tac cca agt 
Tyr Pro Ser 

aaa ttg tec 
Lys Leu Ser 
345 

agg ggc age 
Arg Gly Ser 

360 
acacaaagga 
tgaccactgg 
tcattattaa 
ctaggaaaag 
tcctgagaaa 
agtgaatttt 
taaaagaact 
tgaaaatgaa 



235 
gac tac 
Asp Tyr 
250 

gec tat 
Ala Tyr 

cat gec 
His Ala 

gag att 
Glu He 

gac aag 
Asp Lys 
315 
tac aga 
Tyr Arg 
330 

cac att 
His He 



acaaagcagc 
agtccatggt 
tttttagggt 
tgacagaggc 
tgacatcatc 
agectggtag 
tgtagcatac 
cttttaccaa 



926 



974 



1022 



1070 



1118 



1166 



1214 



1256 



1316 
1376 
1436 
1496 
1556 
1616 
1676 
1736 
1768 



<220>
<221> sig_peptide
<222> 190..252
<223> Von Heijne matrix
score 5.8172934575094
seq ALLWAQEVGQVLA/GR



<400> 8 

acggttgccc tggcagcgcg cgaggctggt gagtcggcag ccctgtggca gccggcgggc 60
tggtttccat ggttgcacga ttaggaacca ccagctgctg catcccatgg ccaggggtgg 120
cgtccaggtg gcagagcagc taggaacgca aggcctgaac ctggggccag acaccctgct 180
ctcccggcc atg gtc aac gac cct cca gta cct gcc tta ctg tgg gcc cag 231
Met Val Asn Asp Pro Pro Val Pro Ala Leu Leu Trp Ala Gln
-20 -15 -10

gag gtg ggc caa gtc ttg gca ggc cgt gcc cgc agg ctg ctg ctg cag 279
Glu Val Gly Gln Val Leu Ala Gly Arg Ala Arg Arg Leu Leu Leu Gln
-5 1 5

ttt ggg gtg ctc ttc tgc acc atc ctc ctt ttg ctc tgg gtg tct gtc 327
Phe Gly Val Leu Phe Cys Thr Ile Leu Leu Leu Leu Trp Val Ser Val
10 15 20 25

ttc ctc tat ggc tcc ttc tac tat tcc tat atg ccg aca gtc agc cac 375
Phe Leu Tyr Gly Ser Phe Tyr Tyr Ser Tyr Met Pro Thr Val Ser His
30 35 40

ctc agc cct gtg cat ttc tac tac agg acc gac tgt gat tcc tcc acc 423
Leu Ser Pro Val His Phe Tyr Tyr Arg Thr Asp Cys Asp Ser Ser Thr
45 50 55

acc tca ctc tgc tcc ttc cct gtt gcc aat gtc tcg ctg act aag ggt 471
Thr Ser Leu Cys Ser Phe Pro Val Ala Asn Val Ser Leu Thr Lys Gly
60 65 70

gga cgt gat cgg gtg ctg atg tat gga cag ccg tat cgt gtt acc tta 519
Gly Arg Asp Arg Val Leu Met Tyr Gly Gln Pro Tyr Arg Val Thr Leu
75 80 85

gag ctt gag ctg cca gag tcc cct gtg aat caa gat ttg ggc atg ttc 567
Glu Leu Glu Leu Pro Glu Ser Pro Val Asn Gln Asp Leu Gly Met Phe
90 95 100 105

ttg gtc acc att tcc tgc tac acc aga ggt ggc cga atc atc tcc act 615
Leu Val Thr Ile Ser Cys Tyr Thr Arg Gly Gly Arg Ile Ile Ser Thr
110 115 120

tct tcg cgt tcg gtg atg ctg cat tac cgc tca gac ctg ctc cag atg 663
Ser Ser Arg Ser Val Met Leu His Tyr Arg Ser Asp Leu Leu Gln Met
125 130 135

ctg gac aca ctg gtc ttc tct agc ctc ctg cta ttt ggc ttt gca gag 711
Leu Asp Thr Leu Val Phe Ser Ser Leu Leu Leu Phe Gly Phe Ala Glu
140 145 150

cag aag cag ctg ctg gag gtg gaa ctc tac gca gac tat aga gag aac 759
Gln Lys Gln Leu Leu Glu Val Glu Leu Tyr Ala Asp Tyr Arg Glu Asn
155 160 165

tcg gtg agt gag tac gtg ccg acc act gga gcg atc att gag atc cac 807
Ser Val Ser Glu Tyr Val Pro Thr Thr Gly Ala Ile Ile Glu Ile His
170 175 180 185

agc aag cgc atc cag ctg tat gga gcc tac ctc cgc atc cac gcg cac 855
Ser Lys Arg Ile Gln Leu Tyr Gly Ala Tyr Leu Arg Ile His Ala His
190 195 200

ttc act ggg ctc aga tac ctg cta tac aac ttc ccg atg acc tgc gcc 903
Phe Thr Gly Leu Arg Tyr Leu Leu Tyr Asn Phe Pro Met Thr Cys Ala
205 210 215

ttc ata ggt gtt gcc agc aac ttc acc ttc ctc agc gtc atc gtg ctc 951
Phe Ile Gly Val Ala Ser Asn Phe Thr Phe Leu Ser Val Ile Val Leu
220 225 230

ttc agc tac atg cag tgg gtg tgg ggg ggc atc tgg ccc cga cac cgc 999
Phe Ser Tyr Met Gln Trp Val Trp Gly Gly Ile Trp Pro Arg His Arg
235 240 245

ttc tct ttg cag gtt aac atc cga aaa aga gac aat tcc cgg aag gaa 1047
Phe Ser Leu Gln Val Asn Ile Arg Lys Arg Asp Asn Ser Arg Lys Glu
250 255 260 265

gtc caa cga agg atc tct gct cat cag cca ggt gca ggg cct gaa ggc 1095
Val Gln Arg Arg Ile Ser Ala His Gln Pro Gly Ala Gly Pro Glu Gly
270 275 280

cag gag gag tca act ccg caa tca gat gtt aca gag gat ggt gag agc 1143
Gln Glu Glu Ser Thr Pro Gln Ser Asp Val Thr Glu Asp Gly Glu Ser
285 290 295

cct gaa gat ccc tca ggg aca gag ggt cag ctg tcc gag gag gag aaa 1191
Pro Glu Asp Pro Ser Gly Thr Glu Gly Gln Leu Ser Glu Glu Glu Lys
300 305 310

cca gat cag cag ccc ctg agc gga gaa gag gag cta gag cct gag gcc 1239
Pro Asp Gln Gln Pro Leu Ser Gly Glu Glu Glu Leu Glu Pro Glu Ala
315 320 325

agt gat ggt tca ggc tcc tgg gaa gat gca gct ttg ctg acg gag gcc 1287
Ser Asp Gly Ser Gly Ser Trp Glu Asp Ala Ala Leu Leu Thr Glu Ala
330 335 340 345

aac ctg cct gct cct gct cct gct tct gct tct gcc cct gtc cta gag 1335
Asn Leu Pro Ala Pro Ala Pro Ala Ser Ala Ser Ala Pro Val Leu Glu
350 355 360

act ctg ggc agc tct gaa cct gct ggg ggt gct ctc cga cag cgc ccc 1383
Thr Leu Gly Ser Ser Glu Pro Ala Gly Gly Ala Leu Arg Gln Arg Pro
365 370 375

acc tgc tct agt tcc tgaagaaaag gggcagactc ctcacattcc agcactttcc 1438
Thr Cys Ser Ser Ser
380

cacctgactc ctctcccctc gtttttcctt caataaacta ttttgtgtca gctccaaaaa 1498
aaaaaaaaaa aa 1510

<210> 9
<211> 882
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 78..410

<220>
<221> sig_peptide
<222> 78..155
<223> Von Heijne matrix
score 10.0731536331164
seq LWLALVSCILTQA/SA

<400> 9 

atggctggcc agaggaggaa cgctttgtgt tctcatcgga gctgcatggg aagtctgcat 60
acagcaaagt gacctgc atg cct cac ctt atg gaa agg atg gtg ggc tct 110
Met Pro His Leu Met Glu Arg Met Val Gly Ser
-25 -20

ggc ctc ctg tgg ctg gcc ttg gtc tcc tgc att ctg acc cag gca tct 158
Gly Leu Leu Trp Leu Ala Leu Val Ser Cys Ile Leu Thr Gln Ala Ser
-15 -10 -5 1

gca gtg cag cga ggt tat gga aac ccc att gaa gcc agt tcg tat ggg 206
Ala Val Gln Arg Gly Tyr Gly Asn Pro Ile Glu Ala Ser Ser Tyr Gly
5 10 15

ctg gac ctg gac tgc gga gct cct ggc acc cca gag gct cat gtc tgt 254
Leu Asp Leu Asp Cys Gly Ala Pro Gly Thr Pro Glu Ala His Val Cys
20 25 30

ttt gac ccc tgt cag aat tac acc ctc cta gat ttg ggg ccc atc act 302
Phe Asp Pro Cys Gln Asn Tyr Thr Leu Leu Asp Leu Gly Pro Ile Thr
35 40 45

cgg aga ggt gca cag tct ccc ggt gtc atg aat gga acc cct agc act 350
Arg Arg Gly Ala Gln Ser Pro Gly Val Met Asn Gly Thr Pro Ser Thr
50 55 60 65

gca ggg ttc ctg gtg gcc tgg cct atg gtc ctc ctg act gtc ctc ctg 398
Ala Gly Phe Leu Val Ala Trp Pro Met Val Leu Leu Thr Val Leu Leu
70 75 80

gct tgg ctg ttc tgagagctcc gctgagcatc tggccttgaa gtttgtgttc 450
Ala Trp Leu Phe
85

ttccctctgg caatggctcc cttcagcact tctgctttcc actccaattc acacaggctt 510
ggtattaaca gaatcaaggc caggctaggt taggaaaagg gaagagcttt caccttcttt 570
aaaactctcg gctgggcgca gtggctcatg cctgtaatcc cagcattttg ggaggctgag 630
gcaggtggat cacctgaggt cagcagttca aaatcagcct ggccaaaatg ctgaaactcc 690
gtctctacta aaaatacaaa aattagccag gcatggtgac aggcgcctgt aatcccagct 750
actcgggagg ccaaggcagg agaattgctc gaactcaggg ggtggaggtt gcagtgagtt 810
gagattgtgc cattgcactc cagcctgggc aacagagcaa gactctgtct caggcaaaaa 870
aaaaaaaaaa aa 882



<210> 10
<211> 1849
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 84..299

<220>
<221> sig_peptide
<222> 84..134

<223> Von Heijne matrix 

score 3.86022363031904 

seq GFVAALVAGGVAG/VS 
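
The `<222>` locations in the feature table above can be cross-checked against the residue numbering used in the sequence rows. The following is a minimal sketch, not part of the listing itself: the helper names `parse_range` and `feature_lengths` are hypothetical, and it assumes that `<222>` ranges are 1-based and inclusive and that the CDS and signal peptide share the same reading frame.

```python
def parse_range(text):
    """Parse a '<222>' location such as '84..299' into (start, end).

    Stray spaces (as in OCR output like '84 . .299') are ignored.
    """
    start, end = text.replace(" ", "").split("..")
    return int(start), int(end)


def feature_lengths(cds, sig_peptide):
    """Return codon counts implied by a CDS range and a sig_peptide range.

    Assumes 1-based inclusive coordinates (an assumption of this sketch).
    """
    cds_start, cds_end = parse_range(cds)
    sig_start, sig_end = parse_range(sig_peptide)
    cds_nt = cds_end - cds_start + 1
    sig_nt = sig_end - sig_start + 1
    if cds_nt % 3 != 0 or sig_nt % 3 != 0:
        raise ValueError("feature length is not a whole number of codons")
    codons = cds_nt // 3
    signal = sig_nt // 3
    return {"codons": codons, "signal_peptide": signal, "mature": codons - signal}


# Ranges taken from the feature table of SEQ ID NO: 10.
print(feature_lengths("84..299", "84..134"))
```

For the ranges of SEQ ID NO: 10 this gives 72 codons, of which 17 belong to the signal peptide and 55 to the mature polypeptide, which agrees with the residue numbering of the sequence rows (positions -17 to 55).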

<400> 10

aaacatggcg gcgcccagcg cgcgaggacg tgatccgctt ctgctccggc ttggattgta 60
gccttgacga ggtctgagcg acc atg gac cgg ccg ggg ttc gtg gca gcg ctg 113
Met Asp Arg Pro Gly Phe Val Ala Ala Leu
-15 -10

gtg gct ggt ggg gta gca ggt gtt tct gtt gac ttg ata tta ttt cct 161
Val Ala Gly Gly Val Ala Gly Val Ser Val Asp Leu Ile Leu Phe Pro
-5 1 5

ctg gat acc att aaa acc agg ctg cag agt ccc caa gga ttt aat aag 209
Leu Asp Thr Ile Lys Thr Arg Leu Gln Ser Pro Gln Gly Phe Asn Lys
10 15 20 25

gct ggt ggt ttt cat gga ata tat gct ggc gtt cct tct gct gct att 257
Ala Gly Gly Phe His Gly Ile Tyr Ala Gly Val Pro Ser Ala Ala Ile
30 35 40

gga tcc ttt cct aat ggt tgc ctg cct gat tcg agt tcc atc 299
Gly Ser Phe Pro Asn Gly Cys Leu Pro Asp Ser Ser Ser Ile
45 50 55

tgaagtggtt aagcagaggg cacaggtatc tgcttctaca agaacatttc agattttctc 359
taacatctta tatgaagagg gtatccaagg gttgtatcga ggctataaaa gcacagtttt 419
aagagagatt cctttttctt tggtccagtt tcccttatgg gagtccttaa aagccctctg 479
gtcctggagg caggatcatg tggtggattc ttggcagtca gcagtctgtg gagcttttgc 539
aggtggattt gccgctgcag tcaccacccc tctagacgtg gcgaagacaa gaattatgct 599
ggcaaaggct ggctccagca ctgctgatgg gaatgtgctc tctgtcctgc atggggtctg 659
gcggtcacag gggctggcag gattatttgc aggtgtcttc cctcgaatgg cagccatcag 719
tctgggaggt ttcatctttc tgggggctta tgaccgaacg cacagcttgc tgttggaagt 779
tggcagaaag agtccttgaa gcagagacaa gcctcacctc cacttctgtc aagagagggg 839
cctgcagtgt aaaccctctt ccgctgagca gctgtctgaa ctataggccc cagtgctgaa 899
gaccagttgt gctaagatac cggcatggag attgtgccat ccgtggtata ggctggctgg 959
tatgaagtca ttggcctgta tgccagagag ctaagagaag aaaacggggt ctgtggcagt 1019
actctgaaca atttcctcag aacctcttaa taaataagtt tggtaatgct gaggccaggc 1079
cttttagagc tttcatttga tctgtatctg atctttcatt tcctgccacc tgatggtgga 1139
ttcagcagaa ggcaagatgg ttataattct aaaagaatag cttgtttgtt tgtttgtttg 1199
ggaaaaggag acttggggaa gagttgtgta tgtgggtgtt tctcccccta gttaattcct 1259
gttgtgtaag ggtaggcttt gttgaaaaag aaagaaagat tgaactacag gtgcatagca 1319
agcactcttt ctgggtaact aggctgctgg ttttaattac cctcagattt cacccataaa 1379
aacgcacaat tgtattattt tacagagatg tgtccagcgc cccctgtggt gtgtgagaga 1439
aagcagctgc aactcaagtg actaggtggg cccagctggc ttcgtgcagg agggcacggt 1499
gggtgagcca ttctcgccat tctcatgtca gactgaaagg agggcctggg ccagctttga 1559
aaaggcagga tgaaatggaa aggtcaccac acttagggat tttagacctt gactaacaag 1619
ctccaggtgt agaaaaattc aaaacaaaat gtcaggaatc tagcagtgtt gtctgccctg 1679
gagcaaacaa acagtatgtg attttgcttc gcctattttt tttttctttt ttgggggaag 1739
ataattaaag gcagaatgac tgcgtttgta aaagaaggac caccaactat actgacattt 1799
ataaatgaac ctttattaaa gacacttcaa tgcaaaaaaa aaaaaaaaaa 1849



<210> 11
<211> 565
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 55..468

<220>
<221> sig_peptide
<222> 55..99
<223> Von Heijne matrix
score 8.96936032049195
seq FTTLLFLAAVAGA/LV

<400> 11 

attccccaga ccttctgcag attctgtggt tatactcact cctcatccca aaga atg 57
Met
-15

aaa ttt acc act ctc ctc ttc ttg gca gct gta gca ggg gcc ctg gtc 105
Lys Phe Thr Thr Leu Leu Phe Leu Ala Ala Val Ala Gly Ala Leu Val
-10 -5 1

tat gct gaa gat gcc tcc tct gac tcg acg ggt gct gat cct gcc cag 153
Tyr Ala Glu Asp Ala Ser Ser Asp Ser Thr Gly Ala Asp Pro Ala Gln
5 10 15

gaa gct ggg acc tct aag cct aat gaa gag atc tca ggt cca gca gaa 201
Glu Ala Gly Thr Ser Lys Pro Asn Glu Glu Ile Ser Gly Pro Ala Glu
20 25 30

cca gct tca ccc cca gag aca acc aca aca gcc cag gag act tcg gcg 249
Pro Ala Ser Pro Pro Glu Thr Thr Thr Thr Ala Gln Glu Thr Ser Ala
35 40 45 50

gca gca gtt cag ggg aca gcc aag gtc acc tca agc agg cag gaa cta 297
Ala Ala Val Gln Gly Thr Ala Lys Val Thr Ser Ser Arg Gln Glu Leu
55 60 65

aac ccc ctg aaa tcc ata gtg gag aaa agt atc tta cta aca gaa caa 345
Asn Pro Leu Lys Ser Ile Val Glu Lys Ser Ile Leu Leu Thr Glu Gln
70 75 80

gcc ctt gca aaa gca gga aaa gga atg cac gga ggc gtg cca ggt gga 393
Ala Leu Ala Lys Ala Gly Lys Gly Met His Gly Gly Val Pro Gly Gly
85 90 95

aaa caa ttc atc gaa aat gga agt gaa ttt gca caa aaa tta ctg aag 441
Lys Gln Phe Ile Glu Asn Gly Ser Glu Phe Ala Gln Lys Leu Leu Lys
100 105 110

aaa ttc agt cta tta aaa cca tgg gca tgagaagctg aataatggga 488
Lys Phe Ser Leu Leu Lys Pro Trp Ala
115 120

tcattggact taaagcctta aatacccttg tagcccagag ctattaaaac gaaagcatcc 548
aaaaaaaaaa aaaaaaa 565

<210> 12
<211> 1663
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 152..475

<220>
<221> sig_peptide
<222> 152..244
<223> Von Heijne matrix
score 10.0910253445132
seq LVLLLVTRSPVNA/CL

<400> 12 

atgtgtctgc tgccgccatt gtgcggcgct ggtcccctca gagggttcct gctgctgccg 60
gtgccttgga ccctccccct cgcttctcgt tctactgccc caggagcccg gcgggtccgg 120
gactcccgtc cgtgccggtg cgggcgccgg c atg tgg ctg tgg gag gac cag 172
Met Trp Leu Trp Glu Asp Gln
-30 -25

ggc ggc ctc ctg ggc cct ttc tcc ttc ctg ctg cta gtg ctg ctg ctg 220
Gly Gly Leu Leu Gly Pro Phe Ser Phe Leu Leu Leu Val Leu Leu Leu
-20 -15 -10

gtg acg cgg agc ccg gtc aat gcc tgc ctc ctc acc ggc agc ctc ttc 268
Val Thr Arg Ser Pro Val Asn Ala Cys Leu Leu Thr Gly Ser Leu Phe
-5 1 5

gtt cta ctg cgc gtc ttc agc ttt gag ccg gtg ccc tct tgc agg gcc 316
Val Leu Leu Arg Val Phe Ser Phe Glu Pro Val Pro Ser Cys Arg Ala
10 15 20

ctg cag gtg ctc aag ccc cgg gac cgc att tct gcc atc gcc cac cgt 364
Leu Gln Val Leu Lys Pro Arg Asp Arg Ile Ser Ala Ile Ala His Arg
25 30 35 40

ggc ggc agc aam sag gcg ccc gag aac acg ctg gcg gcc att cgg cag 412
Gly Gly Ser Xaa Xaa Ala Pro Glu Asn Thr Leu Ala Ala Ile Arg Gln
45 50 55

cta aga atg gag caa cag gcg tgg agt tgg aca ttg agt tta ctt ctg 460
Leu Arg Met Glu Gln Gln Ala Trp Ser Trp Thr Leu Ser Leu Leu Leu
60 65 70

acg gga ttc ctg tct taatgcacga taacacagta gataggacga ctgatgggac 515
Thr Gly Phe Leu Ser
75

tgggcgattg tgtgatttga catttgaaca aattaggaag ctgaatcctg cagcaaacca 575
cagactcagg aatgatttcc ctgatgaaaa gatccctacc ctaagggaag ctgttgcaga 635
gtgcctaaac cataacctca caatcttctt tgatgtcaaa ggccatgcac acaaggctac 695
tgaggctcta aagaaaatgt atatggaatt tcctcaactg tataataata gtgtggtctg 755
ttctttcttg ccagaagtta tctacaaggt aacattcggg atttttcttg tacatattag 815
atgagacaaa cagatcggga tgtaataaca gcattaactc acagaccttg gagcctaagc 875
catacaggag atgggaaacc acgctatgat actttctgga aacattttat atttgttatg 935
atggacattt tgctcgattg gagcatgcat aatatcttgt ggtacctgtg tggaatttca 995
gctttcctca tgcaaaagga ttttgtatcc ccggcctact tgaagaagtg gtcagctaaa 1055
ggaatccagg ttgttggttg gactgttaat acctttgatg aaaagagtta ctacgaatcc 1115
catcttggtt ccagctatat cactgacagc atggtagaag actgcgaacc tcacttctag 1175
actttcacgg tgggacgaaa cgggttcaga aactgccagg ggcctcatac agggatatca 1235
aaataccctt tgtgctagcc caggccctgg ggaatcaggt gactcacaca aatgcaatag 1295
ttggtcactg catttttacc tgaaccaaag ctaaacccgg tgttgccacc atgcaccatg 1355
gcatgccaga gttcaacact gttgctcttg aaaatctggg tctgaaaaaa cgcacaagag 1415
cccctgccct gccctagctg aggcacacag ggagacccag tgaggataag cacagattga 1475
attgtacaat ttgcagatgc agatgtaaat gcatgggaca tgcatgataa ctcagagttg 1535
acattttaaa acttgccaca cttatttcaa atatttgtac tcagctatgt taacatgtac 1595
tgtagacatc aaacttgtgg ccatactaat aaaattatta aaaggagcac taaaaaaaaa 1655
aaaaaaaa 1663



<210> 13
<211> 744
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 112..552

<220>
<221> sig_peptide
<222> 112..183
<223> Von Heijne matrix
score 11.7298925418815
seq FVLGLGLTPPTLA/QD

<400> 13
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tcacaactgg aacccatctc caggaacaaa cagctggaac ccatctcccg ttgaagggaa 60 
actgccagat ttttgtaaga ttcttcctcc tgggagcctg tgttggaaga g atg gtg 117 

Met Val 

atg ggc ctg ggc gtt ttg ttg ttg gtc ttc gtg ctg ggt ctg ggt ctg 165
Met Gly Leu Gly Val Leu Leu Leu Val Phe Val Leu Gly Leu Gly Leu
-20 -15 -10 

acc cca ccg acc ctg get cag gat aac tec agg tac aca cac ttc ctg 213 
Thr Pro Pro Thr Leu Ala Gin Asp Asn Ser Arg Tyr Thr His Phe Leu 
-5 15 10 

10 acc cag cac tat gat gec aaa cca cag ggc egg gat gac aga tac tgt 261 
Thr Gin His Tyr Asp Ala Lys Pro Gin Gly Arg Asp Asp Arg Tyr Cys 

15 20 25 

gaa age ate atg agg aga egg ggc ctg acc tea ccc tgc aaa gac ate 309 
Glu Ser lie Met Arg Arg Arg Gly Leu Thr Ser Pro Cys Lys Asp lie 
15 30 35 40 

aac aca ttt att cat ggc aac aag cgc acg ate aag gee ate tgt gaa 357 
Asn Thr Phe lie His Gly Asn Lys Arg Thr lie Lys Ala lie Cys Glu 

45 50 55 

aac aag aat gga aac cct cac aga gaa aac eta aga ata age aag tct 405 
20 Asn Lys Asn Gly Asn Pro His Arg Glu Asn Leu Arg lie Ser Lys Ser 
60 65 70 

tct ttc cag gtc acc act tgc aag eta cat gga ggt tec ccc tgg cct 453 
Ser Phe Gin Val Thr Thr Cys Lys Leu His Gly Gly Ser Pro Trp Pro 
75 80 85 90 

25 cca tgc cag tac cga gec aca gcg ggg ttc aga aac gtt gtt gtt get 501 
Pro Cys Gin Tyr Arg Ala Thr Ala Gly Phe Arg Asn Val Val Val Ala 

95 100 105 

tgt gaa aat ggc tta cct gtc cac ttg gat cag tea att ttc cgt cgt 549 
Cys Glu Asn Gly Leu Pro Val His Leu Asp Gin Ser lie Phe Arg Arg 
30 110 115 120 

ccg taaccagegg gcccctggtc aagtgctggc tctgctgtcc ttgccttcca 602 
Pro 

tttcccctct gcacccagaa cagtggtggc aacattcatt gccaagggcc caaagaaaga 662 
gctacctgga ccttttgttt tctgtttgac aacatgttta ataaataaaa atgtcttgat 722 
35 atcagcaaaa aaaaaaaaaa aa 744 



<210> 14
<211> 1759
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 101..1243

<220>
<221> sig_peptide
<222> 101..199
<223> Von Heijne matrix
score 3.57142340200611
seq FLCLGMALCPRQA/TR

<400> 14

gtagagtgct gaaggtcctg ccaacggctc tcttggcgtc teaaegtteg gatcagcagc 60 
55 ttttttccat tctctctctc cacttcttca gtgagcagee atg agt tgg act gtg 115 

Met Ser Trp Thr Val 
-30 

cct gtt gtg egg gee age cag aga gtg age teg gtg gga gcg aat ttc 163 
Pro Val Val Arg Ala Ser Gin Arg Val Ser Ser Val Gly Ala Asn Phe 
60 -25 -20 -15 

eta tgc ctg ggg atg gee ctg tgt ccg cgt caa gca acg cgc ate ccg 211 
Leu Cys Leu Gly Met Ala Leu Cys Pro Arg Gin Ala Thr Arg lie Pro 
-10 -5 1 
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etc aac ggc acc tgg etc ttc acc ecc gtg age aag atg gcg act gtg 259 

Leu Asn Gly Thr Trp Leu Phe Thr Pro Val Ser Lys Met Ala Thr Val 

5 10 15 20 

aag agt gag ctt att gag cgt ttc act tec gag aag ecc gtt cat cac 307 

5 Lys Ser Glu Leu lie Glu Arg Phe Thr Ser Glu Lys Pro Val His His 

25 30 35 

agt aag gtc tec ate ata gga act gga teg gtg ggc atg gee tgc get 355 

Ser Lys Val Ser lie lie Gly Thr Gly Ser Val Gly Met Ala Cys Ala 

40 45 50 

10 ate age ate tta tta aaa ggc ttg agt gat gaa ctt gee ctt gtg gat 403 

lie Ser lie Leu Leu Lys Gly Leu Ser Asp Glu Leu Ala Leu Val Asp 

55 60 65 

ctt gat gaa gac aaa ctg aag ggt gag acg atg gat ctt caa cat ggc 451 

Leu Asp Glu Asp Lys Leu Lys Gly Glu Thr Met Asp Leu Gin His Gly 

15 70 75 80 

age cct ttc acg aaa atg cca aat att gtt tgt age aaa gat tac ttt 499 

Ser Pro Phe Thr Lys Met Pro Asn lie Val Cys Ser Lys Asp Tyr Phe 

85 90 95 100 

gtc aca gca aac tec aac eta gtg att ate aca gca ggt gca cgc caa 547 

20 Val Thr Ala Asn Ser Asn Leu Val lie lie Thr Ala Gly Ala Arg Gin 

105 110 115 

gaa aag gga gaa acg cgc ctt aat tta gtc cag cga aat gtg gee ate 595 

Glu Lys Gly Glu Thr Arg Leu Asn Leu Val Gin Arg Asn Val Ala lie 

120 125 130 

25 ttc aag tta atg att tec agt att gtc cag tac age ecc cac tgc aaa 643 

Phe Lys Leu Met lie Ser Ser lie Val Gin Tyr Ser Pro His Cys Lys 

135 140 145 

ctg att att gtt tec aat cca gtg gat ate tta act tat gta get tgg 691 

Leu lie lie Val Ser Asn Pro Val Asp lie Leu Thr Tyr Val Ala Trp 

30 150 155 160 

aag ttg agt gca ttt ecc aaa aac cgt att att gga age ggc tgt aat 73 9 

Lys Leu Ser Ala Phe Pro Lys Asn Arg lie lie Gly Ser Gly Cys Asn 

165 170 175 180 

ctg gat act get cgt ttt cgt ttc ttg att gga caa aag ctt ggt ate 787 

35 Leu Asp Thr Ala Arg Phe Arg Phe Leu lie Gly Gin Lys Leu Gly lie 

185 190 195 

cat tct gaa age tgc cat gga tgg ate etc gga gag cat gga gac tea 835 

His Ser Glu Ser Cys His Gly Trp lie Leu Gly Glu His Gly Asp Ser 

200 205 210 

40 agt gtt cct gtg tgg agt gga gtg aac ata get ggt gtc cct ttg aag 883 

Ser Val Pro Val Trp Ser Gly Val Asn lie Ala Gly Val Pro Leu Lys 

215 220 225 

gat ctg aac tct gat ata gga act gat aaa gat cct gag caa tgg aaa 931 

Asp Leu Asn Ser Asp lie Gly Thr Asp Lys Asp Pro Glu Gin Trp Lys 

45 230 235 240 

aat gtc cac aaa gaa gtg act gca act gec tat gag att att aaa atg 979 

Asn Val His Lys Glu Val Thr Ala Thr Ala Tyr Glu lie lie Lys Met 

245 250 255 260 

aaa ggt tat act tct tgg gec att ggc eta tct gtg gee gat tta aca 1027 

50 Lys Gly Tyr Thr Ser Trp Ala lie Gly Leu Ser Val Ala Asp Leu Thr 

265 270 275 

gaa agt att ttg aag aat ctt agg aga ata cat cca gtt tec acc ata 1075 

Glu Ser lie Leu Lys Asn Leu Arg Arg lie His Pro Val Ser Thr lie 

280 285 290 

55 att aag ggc etc tat gga ata gat gaa gaa gta ttc etc agt att cct 1123 

lie Lys Gly Leu Tyr Gly lie Asp Glu Glu Val Phe Leu Ser lie Pro 

295 300 305 

tgt ate ctg gga gag aac ggt att acc aac ctt ata aag ata aag ctg 1171 

Cys lie Leu Gly Glu Asn Gly lie Thr Asn Leu lie Lys lie Lys Leu 

60 310 315 320 

acc cct gaa gaa gag gec cat ctg aaa aaa agt gca aaa aca etc tgg 1219 

Thr Pro Glu Glu Glu Ala His Leu Lys Lys Ser Ala Lys Thr Leu Trp 

325 330 335 340 
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gaa att cag aat aag ctt aag ctt taaagttgcc taaaactacc attccgaaat 1273 
Glu lie Gin Asn Lys Leu Lys Leu 
345 

tattgaagag atcatagata caggattata taacgaaatt ttgaataaac ttgaattcct 1333 

5 aaaagatgga aacaggaaag taggtagagt gattttccta tttatttagt cctccagctc 13 93 

ttttattgag catccacgtg ctggacgata cttatttaca attcctaagt atttttggta 1453 

cctctgatgt agcagcactt gccatgttat atatatgtag ttggcatttg gttcccaaaa 1513 

agtaggatgt aggtatttat tgtgttctag aaattccgac tcttttcatt agatatatgc 1573 

tatttctttc attcttgctg gtttatacct atgttcattt atatgctgta aaaaagtagt 1633 

10 agcttcttct acaatgtaaa aataaatgta catacaaaaa aatgcagtag tatatacaat 1693 

cttttgtttt gcttcctttg atagttaata aattccgttt gttgaatcaa taaaaaaaaa 1753 

aaaaaa 1759 

<210> 15
<211> 1755
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 101..517

<220>
<221> sig_peptide
<222> 101..199
<223> Von Heijne matrix
score 3.57613483592743
seq FLCLGMALCLRQA/TR

<400> 15

gtagagtgct gaaggtcctg ccaacggctc tcttggcgtc tcaacgttcg gatcagcagc 60 

ttttttccat tctctctctc cacttcttca gtgagcagcc atg agt tgg act gtg 115 

Met Ser Trp Thr Val 
-30 

35 cct gtt gtg egg gec age cag aga atg age teg gtg gga gcg aat ttc 163 

Pro Val Val Arg Ala Ser Gin Arg Met Ser Ser Val Gly Ala Asn Phe 

-25 -20 -15 

eta tgc ctg ggg atg gec ctg tgt ctg cgt caa gca acg cgc ate ccg 211 

Leu Cys Leu Gly Met Ala Leu Cys Leu Arg Gin Ala Thr Arg lie Pro 

40 -10 -5 1 

etc aac ggc acc tgg etc ttc aca ccc gtg age aag atg gcg act gtg 259 

Leu Asn Gly Thr Trp Leu Phe Thr Pro Val Ser Lys Met Ala Thr Val 

5 10 15 20 

aag agt gag ctt att gag cgt ttc act tec gag aag ccc gtt cat cac 307 

45 Lys Ser Glu Leu lie Glu Arg Phe Thr Ser Glu Lys Pro Val His His 

25 30 35 

agt aag gtc tec ate ata gga act gga teg gtg ggc atg gec tgc get 355 

Ser Lys Val Ser lie lie Gly Thr Gly Ser Val Gly Met Ala Cys Ala 
40 45 50 

50 ate age ate ttg tta aaa ggc ttg agt gat gaa ctt gec ctt gtg gat 403 

lie Ser lie Leu Leu Lys Gly Leu Ser Asp Glu Leu Ala Leu Val Asp 

55 60 65 

ctt gat gaa gac aaa ctg aag ggt gag acg atg gat ctt caa cat ggc 451 

Leu Asp Glu Asp Lys Leu Lys Gly Glu Thr Met Asp Leu Gin His Gly 

55 70 75 80 

age cct ttc acg aaa atg cca ata ttg ttt gta gca aag att act ttg 499 

Ser Pro Phe Thr Lys Met Pro lie Leu Phe Val Ala Lys lie Thr Leu 

85 90 95 100 

tea cag caa act cca acc tagtgattat cacagcaggt gcacgccaag 547 

60 Ser Gin Gin Thr Pro Thr 

105 

aaaagggaga aacgcgcctt aatttagtcc agegaaatgt ggccatcttc aagtaatgat 607 
ttccagtatt gtccagtaca gcccccactg caaactgatt attgtttcca atccagtgga 667
tatcttaact tatgtagctt ggaagttgag tgcatttccc aaaaaccgta ttattggaag 727
cggctgtaat ctggatactg ctcgttttcg tttcttgatt ggacaaaagc ttggtatcca 787
ttctgaaagc tgccatggat ggatcctcgg agagcatgga gactcaagtg ttcctgtgtg 847
gagtggagtg aacatagctg gtgtcccttt gaaggatctg aactctgata taggaactga 907
taaagatcct gagcaggaaa aatgtccaca aagaagtgac tgcaactgcc tatgagatta 967
ttaaaatgaa aggttatact tcttgggcca ttggcctatc tgtggccgat ttaacagaaa 1027
gtattttgaa gaatcttagg agaatacatc cagtttccac cataactaag ggcctctatg 1087
gaatagatga agaagtattc ctcagtattc cttgtatcct gggagagaac ggtattacca 1147
accttataaa gataaagctg acccctgaag aagaggccca tctgaaaaaa agtgcaaaaa 1207
cactctggga aattcagaat aagcttaagc tttaaagttg cctaaaacta ccattccgaa 1267
attattgaag agatcataga tacaggatta tataacgaaa ttttgaataa acttgaattc 1327
ctaaaagatg gaaacaggaa agtaggtaga gtgattttcc tatttattta gtcctccagc 1387
tcttttattg agcatccacg tgctggacga tacttattta caattcctaa gtatttttgg 1447
tacctctgat gtagcagcac ttgccatgtt atatatatgt agttggcatt tggttcccaa 1507
aaagtaggat gtaggtattt attgtgttct agaaattccg actcttttca ttagatatat 1567
gctatttctt tcattcttgc tggtttatac ctatgttcat ttatatgctg taaaaaagta 1627
gtagcttctt ctacaatgta aaaataaatg tacatacaaa aaaatgcagt agtatataca 1687
atcttttgtt ttgcttcctt tgatagttaa taaattccgt ttgttgaatc aataaaaaaa 1747
aaaaaaaa 1755

<210> 16
<211> 936
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 59..853



<220>
<221> sig_peptide
<222> 59..100
<223> Von Heijne matrix
score 5.2402423806254
seq NFILFIFIPGVFS/LK



<400> 16
agaaaggagg ctctgggtag acgcactaga ttactggata aatcacttca atttccca 58
atg aat ttt ata ttg ttt att ttt ata cct gga gtt ttt tcc tta aaa 106
Met Asn Phe Ile Leu Phe Ile Phe Ile Pro Gly Val Phe Ser Leu Lys
-10 -5 1

agt agc act ttg aag cct act att gaa gca ttg cct aat gtg cta cct 154
Ser Ser Thr Leu Lys Pro Thr Ile Glu Ala Leu Pro Asn Val Leu Pro
5 10 15

tta aat gaa gat gtt aat aag cag gaa gaa aag aat gaa gat cat act 202
Leu Asn Glu Asp Val Asn Lys Gln Glu Glu Lys Asn Glu Asp His Thr
20 25 30

ccc aat tat gct cct gct aat gag aaa aat ggc aat tat tat aaa gat 250
Pro Asn Tyr Ala Pro Ala Asn Glu Lys Asn Gly Asn Tyr Tyr Lys Asp
35 40 45 50

ata aaa caa tat gtg ttc aca aca caa aat cca aat ggc act gag tct 298
Ile Lys Gln Tyr Val Phe Thr Thr Gln Asn Pro Asn Gly Thr Glu Ser
55 60 65

gaa ata tct gtg aga gcc aca act gac ctg aat ttt gct cta aaa aac 346
Glu Ile Ser Val Arg Ala Thr Thr Asp Leu Asn Phe Ala Leu Lys Asn
70 75 80

gga tca acc cca aac gtg cct gca ttt tgg aca atg tta gct aaa gct 394
Gly Ser Thr Pro Asn Val Pro Ala Phe Trp Thr Met Leu Ala Lys Ala
85 90 95

ata aat gga aca gca gtg gtc atg gat gat aaa gat caa tta ttt cac 442
Ile Asn Gly Thr Ala Val Val Met Asp Asp Lys Asp Gln Leu Phe His
100 105 110

cca att cca gag tct gat gtg aat gct aca cag gga gaa aat cag cca 490
Pro Ile Pro Glu Ser Asp Val Asn Ala Thr Gln Gly Glu Asn Gln Pro
115 120 125 130

gat eta gag gat ctg aag ate aaa ata atg ctg gga ate teg ttg atg 538 

Asp Leu Glu Asp Leu Lys lie Lys lie Met Leu Gly lie Ser Leu Met 

5 135 140 145 

ace etc etc etc ttt gtg gtc etc ttg gca ttc tgt agt get aca ctg 586 

Thr Leu Leu Leu Phe Val Val Leu Leu Ala Phe Cys Ser Ala Thr Leu 

150 155 160 

tac aaa ctg agg cat ctg agt tat aaa agt tgt gag agt cag tac tct 634 

10 Tyr Lys Leu Arg His Leu Ser Tyr Lys Ser Cys Glu Ser Gin Tyr Ser 

165 170 175 

gtc aac cca gag ctg gec acg atg tct tac ttt cat cca tea gaa ggt 682 

Val Asn Pro Glu Leu Ala Thr Met Ser Tyr Phe His Pro Ser Glu Gly 
180 185 190 

15 gtt tea gat aca tec ttt tec aag agt gca gag age age aca ttt ttg 73 0 

Val Ser Asp Thr Ser Phe Ser Lys Ser Ala Glu Ser Ser Thr Phe Leu 
195 200 205 210 

ggt ace act tct tea gat atg aga aga tea ggc aca aga aca tea gaa 778 

Gly Thr Thr Ser Ser Asp Met Arg Arg Ser Gly Thr Arg Thr Ser Glu 

20 215 220 225 

tct aag ata atg acg gat ate att tec ata ggc tea gat aat gag atg 826 

Ser Lys lie Met Thr Asp lie lie Ser lie Gly Ser Asp Asn Glu Met 

230 235 240 

cat gaa aac gat gag teg gtt acc egg tgaagaaatc aaggaacccg 873 

25 His Glu Asn Asp Glu Ser Val Thr Arg 
245 250 
gtgaagaaat cttattgatg aataaataac tttaattatt ttgtcatcaa aaaaaaaaaa 933 
aaa 936 

<210> 17
<211> 747
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 73..672

<220>
<221> sig_peptide
<222> 73..132
<223> Von Heijne matrix
score 5.21332530399231
seq SPVFLVFPPEITA/SE

<400> 17

acaagaaaag aacatggtct agactgaagt accaactaaa tcatctcctt tcaaattatc 60 
accgacacca tc atg gat tea age acc gca cac agt ccg gtg ttt ctg gta 111 
Met Asp Ser Ser Thr Ala His Ser Pro Val Phe Leu Val 
50 -20 -15 -10 

ttt cct cca gaa ate act get tea gaa tat gag tec aca gaa ctt tea 159 
Phe Pro Pro Glu lie Thr Ala Ser Glu Tyr Glu Ser Thr Glu Leu Ser 

-5 15 
gee acg acc ttt tea act caa age ccc ttg caa aaa tta ttt get aga 207 
55 Ala Thr Thr Phe Ser Thr Gin Ser Pro Leu Gin Lys Leu Phe Ala Arg 
10 15 20 25 

aaa atg aaa ate tta ggg act ate cag ate ctg ttt gga att atg acc 255 
Lys Met Lys lie Leu Gly Thr lie Gin lie Leu Phe Gly lie Met Thr 
30 35 40 

60 ttt tct ttt gga gtt ate ttc ctt ttc act ttg tta aaa cca tat cca 303 
Phe Ser Phe Gly Val lie Phe Leu Phe Thr Leu Leu Lys Pro Tyr Pro 

45 50 55 

agg ttt ccc ttt ata ttt ctt tea gga tat cca ttc tgg ggc tct gtt 351 
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10 



15 



20 



25 



30 



Airg 
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Phe 


He 
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Glv 


Tvr 

jC 


Pro 


Phe 


Trn 


Glv 


Ser 


Val 








60 










65 










70 










tta 
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aat 


tct 


aaa 
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ttc 
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att 


gca 


ata 
y "-y 


aaa 


aga 


aaa 


acc 


3 99 


Leu 


Phe 


He 


Asn 
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Glv 


Ala 


Phe 


Leu 


He 


Ala 


Val 


Lvs 

JL ° 


Arg 


Lvs 


Thr 






75 










80 










85 












3.C3. 


gaa 


act 


c tg 


ata 


ata 


tta 


age 


cga 


ata 


atg 


aat 


ttt 


ctt 


aat 


gee 


447 


Thr 


Glu 


Thr 


Leu 


He 


He 


Leu 


Ser 


Arg 


He 


Met 


Asn 


Phe 


Leu 


Ser 


Ala 




90 










95 










100 










105 




c tg 


yyci 


of c a 


ata 


act 


aaa 
yy a 


ate 


att 


etc 


etc 


aca 


ttt 


aat 
yy *- 


ttc 


ate 


eta 


4 95 


Leu 


Glv 


Ala 


He 


Ala 


Glv 


He 


He 


Leu 


Leu 


Thr 


Phe 


Glv 

JL 


Phe 


He 


Leu 












110 










115 










120 








c aa 


aac 


tac 


att 


tgt 


aat 

yy L - 


tat 


tct 


cac 


caa 


aat 


agt 


cag 


tat 
uy u 


aag 


543 


Asp 


Gin 


Asn 




He 


Cys 


Glv 


Tvr 
j 


Ser 


His 


Gin 


Asn 


Ser 


Gin 


Cys 


Lys 










125 










13 0 










135 








act' 


att 

y u u 


act 


y *— *— 


eta 


ttc 


tta 

L - '-y 


aaa 


att 


tta 
L -y 


att 


aca 


tta 
L - L -y 


atg 


act 


ttc 


591 


Ala 


Val 


Thr 


Val 


Leu 


Phe 


Leu 


Glv 

JL 


He 


Leu 


He 


Thr 


Leu 


Met 


Thr 


Phe 








140 










145 










150 










age 


att 


att 


gaa 


tta 


ttc 


att 


tct 


ctg 


cct 


ttc 


tea 


att 


ttg 


ggg 


tgc 


639 


Ser 


He 


He 


Glu 


Leu 


Phe 


He 


Ser 


Leu 


Pro 


Phe 


Ser 


He 


Leu 


Gly 


Cys 






155 










160 










165 












cac 


tea 


gag 


gat 


tgt 


gat 


tgt 


gaa 


caa 


tgt 


tgt 


tgactagcac tgtgagaata 


692 


His 


Ser 


Glu 


Asp 


Cys 


Asp 


Cys 


Glu 


Gin 


Cys 


Cys 















170 175 180 

aagatgtgtt aaaatattaa aaaaaaaaaa aaaaaaaaag aaaaaaaaaa aaaaa 747 

<210> 18
<211> 1884
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 94..1275

<220>
<221> sig_peptide
<222> 94..210
<223> Von Heijne matrix
score 4.55778392992629
seq LVLVKRLLAVSVS/CI

<400> 18

acagegegtg cagcctcgtg cagctcttct ggtctccggc gcccgcccct cagaegtaat 60 
gttgaattaa agaaaatact ttatcagaag aag atg gec act gec cag ttg cag 114 
45 Met Ala Thr Ala Gin Leu Gin 

-35 

agg act ccc atg agt gca ctg gta ttt ccc aat aag ata tea act gaa 162 
Arg Thr Pro Met Ser Ala Leu Val Phe Pro Asn Lys He Ser Thr Glu 
-30 -25 -20 

50 cac cag tct ttg gtg tta gtg aag agg ctt eta gca gtt tea gta tec 210 
His Gin Ser Leu Val Leu Val Lys Arg Leu Leu Ala Val Ser Val Ser 

-15 -10 -5 

tgt ate acg tat ttg agg gga ata ttc cca gaa tgc get tat gga aca 258 
Cys He Thr Tyr Leu Arg Gly He Phe Pro Glu Cys Ala Tyr Gly Thr 
55 1 5 10 15 

aga tat eta gat gat ctt tgt gtc aaa ata ctg aga gaa gat aaa aat 3 06 

Arg Tyr Leu Asp Asp Leu Cys Val Lys He Leu Arg Glu Asp Lys Asn 

20 25 30 

tgc cca gga tct aca cag tta gtg aaa tgg att eta gga tgt tat gat 354 
60 Cys Pro Gly Ser Thr Gin Leu Val Lys Trp He Leu Gly Cys Tyr Asp 
35 40 45 

get tta cag aaa aaa tat eta agg atg gtt gtt eta get gta tac aca 402 
Ala Leu Gin Lys Lys Tyr Leu Arg Met Val Val Leu Ala Val Tyr Thr 




WO 01/42451 



PCT/IB00/01938 



50 55 60 



10 



15 



20 gga gtt ata ttt gaa ggg gaa cct atg tat tta aat gtg gga gaa gtc 73 8 



25 



30 



40 



45 



aac 


cca 


gaa 


gat 


cct 


cag 


aca 


att 
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gaa 


tat 

uy u 


tac 


caa 


ttc 


aaa 


ttc 


Asn 


Pro 


Glu 


Asp 


Pro 


Gin 


Thr 


lie 


Ser 


Glu 


Cy s 


Tvr 
xy 1 


Gin 


Phe 


Lys 


Phe 


65 










7 0 










75 
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Pro 
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Met 
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He 


Ser 


Lys 


Asn 


Gin 










85 










90 










95 




ay u 


aac 


gaa 


tct 


age 


atg 


tta 

u uy 


tct 


act 


gac 
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aag 


aaa 


gca 


age 


att 


Ser 


Asn 


Glu 


Ser 
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Leu 
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Thr 


Asp 
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Lys 


Ala 
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He 
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1 \j \j 
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35 cag gaa aaa aac cct gca tct tct gaa ctt gaa gaa cca agt tta gtt 978 



1026 



1074 



1122 



1170 



50 aga agt caa cat gaa tct ggg aga ata gtc etc cat cac ttt gat tct 1218 

12 66 

Pro Lys 

55 340 345 350 

gaa cat ata taaaaattat ttttgttctg caggcttgea gagttcttct 1315 
Glu His He 
355 

caccatttaa actgaaggac cctatattat atttccctaa ctctgaagat gtatatgtag 1375 

60 tttaaagcag tttatacact aaaactaagt ttttggctga ctgtcatatt gtggtcctta 1435 

atcttgagat aaatccaata gaacttttga ataaaagcaa aagtacaaat gtcataattg 1495 

atteggtaat aagtaaaatt tcaaaattga ttttgttcat tacctactta atatttcctt 1555 

taaatatata ctaactgtta aggccctcta atgccatttt tctaaacagt aatgtttact 1615 
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ttggtattaa aatttggtat tgattcactt tttacttatg ttaaaattat accatttaac 1675 

tggctctttt gtcattgtgc tgttattaaa acaatgttct tcaatatttt gacataatgt 1735 

attaacattt taatatataa tgtacaattt aagaattggt gctttacctt tactatgctt 1795 

tttttacagg acaaaaagac tgatttttaa agtatggcat tttttgcagc ataaataaaa 1855 

5 tattgttcag tacgaaaaaa aaaaaaaaa 1884 

<210> 19
<211> 691
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 42..515

<220>
<221> sig_peptide
<222> 42..92
<223> Von Heijne matrix
score 10.7019149919754
seq VLMLLAVLIWTGA/EN

<400> 19

gagttgtcct gtgctggagg tctgctcaga cgaaggtctc c atg gcg tta gaa gtc 56 
25 Met Ala Leu Glu Val 

-15 

ttg atg etc etc get gtc ttg att tgg acc ggt get gag aac etc cat 104 

Leu Met Leu Leu Ala Val Leu lie Trp Thr Gly Ala Glu Asn Leu His 
-10 -5 1 

30 gtg aaa ata agt tgc tct ctg gac tgg ttg atg gtc tea gtt ate cca 152 

Val Lys lie Ser Cys Ser Leu Asp Trp Leu Met Val Ser Val lie Pro 
5 10 15 20 

gtt gca gaa age aga aat ctg tat ata ttt gcg gat gaa tta cat ctg 200 

Val Ala Glu Ser Arg Asn Leu Tyr lie Phe Ala Asp Glu Leu His Leu 
35 25 30 35 

gga atg ggc tgc cct gca aat egg ata cat aca tat gta tat gag ttt 248 

Gly Met Gly Cys Pro Ala Asn Arg lie His Thr Tyr Val Tyr Glu Phe 

40 45 50 

ata tat ctt gtt cgt gat tgt ggc ate agg aca agg gta gtt tct gag 2 96 

40 lie Tyr Leu Val Arg Asp Cys Gly lie Arg Thr Arg Val Val Ser Glu 
55 60 65 

gaa act etc ctt ttt caa acc gag ctg tac ttt acc cca agg aat ata 344 

Glu Thr Leu Leu Phe Gin Thr Glu Leu Tyr Phe Thr Pro Arg Asn lie 

70 75 80 

45 gat cat gac cct cag gaa ate cat ttg gag tgt tec acc tct agg aaa 392 

Asp His Asp Pro Gin Glu lie His Leu Glu Cys Ser Thr Ser Arg Lys 
85 90 95 100 

tea gtg tgg ctt aca cca gtt tct act gag aat gaa ata aaa ttg gat 440 

Ser Val Trp Leu Thr Pro Val Ser Thr Glu Asn Glu lie Lys Leu Asp 
50 105 110 115 

cct agt cct ttt att get gac ttt cag aca aca gca gaa gag tta gga 488 

Pro Ser Pro Phe lie Ala Asp Phe Gin Thr Thr Ala Glu Glu Leu Gly 

120 125 130 

tta tta tct tct agt cca aac ttg etc tgagctaaag gagaaatgga 535 

55 Leu Leu Ser Ser Ser Pro Asn Leu Leu 
135 140 
aacttgaagc tggtgttatg tattttgeag gaaaacagtt tcattttttc atagcaaaaa 595 
tatagttggt gtatatctct ccttaagtct ctggtttcta aaaaccctac ttcagtaaag 655 
gtcctgatta gttgattagc gaaaaaaaaa aaaaaa 691 



60 



<210> 20
<211> 1138
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 271..969

<220>
<221> sig_peptide
<222> 271..366
<223> Von Heijne matrix
score 5.6680378526706
seq WMGLACFRSLAAS/SP

<220>
<221> misc_feature
<222> 989
<223> n=a, g, c or t

<400> 20
aaaaaccttt caagtgcccc ctcctttcct taaagtcttt tataggggtc cccttcttgg 60
ccatctccat cctgtgagtc aggactgaaa gggcacagac aggtcactgc cagcattgtt 120
ggggcaagcc tgcaagcacg catcactggg gatctgacat gacaatggcc gcctgccccc 180
tctgagggct acaggactta ccccagtggg aagcagctaa gcaggtctga ccagccgacc 240
tggacctggc caagggtcct gtcatccctc atg gcc acc ccg cca ttc cgg ctg 294
Met Ala Thr Pro Pro Phe Arg Leu
-30 -25
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atg 
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aag 


ctg 
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gag 


gaa 


aag 


get 


ttt 


cgc 


gaa 


gag 


atg 


aaa 


438 




Leu 


Met 


His 


Lys 


Leu 


Gin 


Glu 


Glu 


Lys 


Ala 


Phe 


Arg 


Glu 


Glu 


Met 


Lys 




35 




10 










15 










20 














att 


ttt 


cgt 


gaa 


aaa 


ata 


gag 


gac 


ttc 


agg 


gaa 


gag 


atg 


tgg 


act 


ttc 


486 




He 


Phe 


Arg 


Glu 


Lys 


He 


Glu 


Asp 


Phe 


Arg 


Glu 


Glu 


Met 


Trp 


Thr 


Phe 






25 










30 










35 










40 






cga 


ggc 


aag 


ate 


cat 


get 


ttc 


egg 


ggc 


cag 


ate 


ctg 


ggt 


ttt 


tgg 


gaa 


534 


40 


Arg 


Gly 


Lys 


He 


His 


Ala 


Phe 


Arg 


Gly 


Gin 


He 


Leu 


Gly 


Phe 


Trp 


Glu 














45 










50 










55 








gag 


gag 


aga 


cct 


ttc 


tgg 


gaa 


gag 


gag 


aaa 


acc 


ttc 


tgg 


aaa 


gag 


gaa 


582 




Glu 


Glu 


Arg 


Pro 


Phe 


Trp 


Glu 


Glu 


Glu 


Lys 


Thr 


Phe 


Trp 


Lys 


Glu 


Glu 












60 










65 










70 








45 


aaa 


tec 


ttc 


tgg 


gaa 


atg 


gaa 


aag 


tct 


ttc 


agg 


gag 


gaa 


gag 


aaa 


act 


630 




Lys 


Ser 


Phe 


Trp 


Glu 


Met 


Glu 


Lys 


Ser 


Phe 


Arg 


Glu 


Glu 


Glu 


Lys 


Thr 










75 










80 










85 












ttc 


tgg 


aaa 


aag 


tac 


cgc 


act 


ttc 


tgg 


aag 


gag 


gat 


aag 


gcc 


ttc 


tgg 


678 




Phe 


Trp 


Lys 


Lys 


Tyr 


Arg 


Thr 


Phe 


Trp 


Lys 


Glu 


Asp 


Lys 


Ala 


Phe 


Trp 




50 




90 










95 










100 
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tgg 


gaa 
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gac 


egg 


aac 


ctt 


ctt 


cag 


gag 


726 




Lys 


Glu 


Asp 


Asn 


Ala 


Leu 


Trp 


Glu 


Arg 


Asp 


Arg 


Asn 


Leu 


Leu 


Gin 


Glu 






105 










110 










115 










120 






gac 


aag 


gcc 


ctg 


tgg 


gag 


gaa 


gaa 


aag 


gcc 


ctg 


tgg 


gta 


gag 


gaa 


aga 


774 


55 


Asp 


Lys 


Ala 


Leu 


Trp 


Glu 


Glu 


Glu 


Lys 


Ala 


Leu 


Trp 


Val 


Glu 


Glu 


Arg 














125 










130 










135 








gcc 


etc 


ctt 


gag 


ggg 


gag 


aaa 


gcc 


ctg 


tgg 


gaa 


gat 


aaa 


acg 


tec 


etc 


822 




Ala 


Leu 


Leu 


Glu 


Gly 


Glu 


Lys 


Ala 


Leu 


Trp 


Glu 


Asp 


Lys 


Thr 


Ser 


Leu 












140 










145 










150 








60 


tgg 


gag 


gaa 


gag 


aat 
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etc 


tgg 
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gag 


agg 


gcc 
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tgg 


atg 


870 




Trp 


Glu 


Glu 


Glu 


Asn 
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Leu 


Trp 


Glu 


Glu 


Glu 


Arg 


Ala 


Phe 


Trp 


Met 










155 










160 










165 












gag 


aac 


aat 


ggc 


cac 


att 


gcc 


gga 


gag 


cag 


atg 


etc 


gaa 


gat 


ggg 


ccc 


918 



25 
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10 



15 



Glu Asn Asn Gly His lie Ala Gly Glu Gin Met Leu Glu Asp Gly Pro 

170 175 180 

cac aac gcc aac aga ggg cag cgc ttg ctg gcc ttc tcc cga ggc agg 966
His Asn Ala Asn Arg Gly Gln Arg Leu Leu Ala Phe Ser Arg Gly Arg
185 190 195 200

gcg tagccagcat gcaggtgcan gggccctgtg gtccagactc ccctgggttg 1019
Ala

ggattcaagt ccagggtgag cccatgtgct ggagaaaata cacactcatt ggtctccttg 1079
ctttgaaaga tccaataaag tcctgaggca aggtttggaa aaccaaaaaa aaaaaaaaa 1138

<210> 21
<211> 468
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 76..276

<220>
<221> sig_peptide
<222> 76..135
<223> Von Heijne matrix
score 5.21332530399231
seq SPVFLVFPPEITA/SE



<400> 21
agcacaagaa aagaacatgg tctagactga agtaccaact aaatcatctc ctttcaaatt 60
atcaccgaca ccatc atg gat tca agc acc gca cac agt ccg gtg ttt ctg 111
Met Asp Ser Ser Thr Ala His Ser Pro Val Phe Leu
-20 -15 -10

gta ttt cct cca gaa atc act gct tca gaa tat gag tcc aca gaa ctt 159
Val Phe Pro Pro Glu Ile Thr Ala Ser Glu Tyr Glu Ser Thr Glu Leu
-5 1 5

tca gcc acg acc ttt tca act caa agc ccc ttg caa aaa tta ttt gct 207
Ser Ala Thr Thr Phe Ser Thr Gln Ser Pro Leu Gln Lys Leu Phe Ala
10 15 20

aga aaa atg aaa atc tta ggg gat atc cat tct ggg gct ctg ttt tgt 255
Arg Lys Met Lys Ile Leu Gly Asp Ile His Ser Gly Ala Leu Phe Cys
25 30 35 40

tca tta att ctg gag cct tcc taattgcagt gaaaagaaaa accacagaaa 306
Ser Leu Ile Leu Glu Pro Ser
45

ctctgggaat tttgattaca ttgatgactt tcagcattat tgaattattc atttctctgc 366
ctttctcaat tttggggtgc cactcagagg attgtgattg tgaacaatgt tgttgactag 426
cactgtgaga ataaagatgt gttaaaataa aaaaaaaaaa aa 468



<210> 22
<211> 720
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 6..287

<220>
<221> sig_peptide
<222> 6..80
<223> Von Heijne matrix
score 4.17710408129886
seq ISLSHLFLDLSRS/LW

<400> 22

atttg atg tgc ttc tta gtc teg ttt aac ttg ccg att cat ata tec ctg 
Met Cys Phe Leu Val Ser Phe Asn Leu Pro lie His lie Ser Leu 
-25 -20 -15 

tct cat ttg ttc tta gat ttg tea cga age etc tgg ttt ttg get tgt 
Ser His Leu Phe Leu Asp Leu Ser Arg Ser Leu Trp Phe Leu Ala Cys 
-10 -5 15 

cct ggt ttg aac ttg gtg tat ctg get ctt gac tea ttt tct gac etc 
Pro Gly Leu Asn Leu Val Tyr Leu Ala Leu Asp Ser Phe Ser Asp Leu 

15 



10 



20 
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cca 


tec 


tta 


aat 


ctg 


ctt 


ttc 
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ttt 


gta 
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ggc 


ttt 


ggc 


gtc 
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Pro 


Ser 


Leu 


Asn 


Leu 


Leu 


Phe 


Tyr 


Phe 
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Pro 


Gly 


Phe 


Gly 


Val 






25 










30 










35 








tec 


aag 


tac 


ctg 


ace 


tea 


get 


caa 


cct 


gtc 


ttg 


ggt 


ttt 


ctt 


etc 


etc 


Ser 


Lys 


Tyr 


Leu 


Thr 


Ser 


Ala 


Gin 


Pro 


Val 


Leu 


Gly 


Phe 


Leu 


Leu 


Leu 




40 










45 










50 
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gac 


att 


gac 


aac 
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gee 


etc 


eta 


ggc 


aca 


gag 


aga 


tgg 


age 




Pro 


Asp 


He 


Asp 


Asn 


Pro 


Ala 


Leu 


Leu 


Gly 


Thr 


Glu 


Arg 


Trp 


Ser 




55 










60 










65 













tgagtgtggt tttcctgaaa taaagcttgc attatgagag ggaataaaca gaagaaaaaa 
atagtaagta aaatcttgct tgcctctcag taaaataaag ctctattttt cgtttttttt
ttttccaact tcctgtacaa aaaagggaaa actttagctt ttgggggaaa tttggagcta 
gcctgttggt actgttgagc ttagtgtatc tataactata tattattcca caatatctta 
aatactttat aaagatattt tcataaatta cagcaatcct ggctttagat gattgatggc 
catttttaaa caattaaagc taatttctag ctttttatga gtttggtatt aagcacagta 
gtttcttaga aagtctccag ggaatgcatt ttgcaaaata aaaatcagct aatgacccaa
aaaaaaaaaa aaa 



50 



98 



146 



194 



242 



287 



347 
407 
467 
527 
587 
647 
707 
720 



<210> 23
<211> 727
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 171..692

<220>
<221> sig_peptide
<222> 171..227
<223> Von Heijne matrix
score 4.17573075349936
seq LLLGQRCSLKVSG/QE



<400> 23

attgtgacat caccgtgcac tagccaatgg ctgcctgcct

tgggactact agccctttgt tgatagggag aagccaacat

cttcagggca gctcccagag catggatccc tcctgattcc



50 etc aca gtc 

Leu Thr Val 
-15 

ggg caa gag 

Gly Gin Glu 
55 1 

aa 9 9 fc 9 cct 

Lys Val Pro 

gag gat gac 

60 Glu Asp Asp 

ate aat gtc 

He Asn Val 



aag ctg etc 
Lys Leu Leu 



agt gta 
Ser Val 

gag gag 
Glu Glu 

20 
aag cac 
Lys His 
35 

ate atg 
He Met 



gee 
Ala 
5 

cag 
Gin 

etc 
Leu 

cag 
Gin 



ctg ggc cag aga 
Leu Gly Gin Arg 
-10 

acg ctg aag aga 
Thr Leu Lys Arg 



cag cac 
Gin His 

tct gac 
Ser Asp 

ccc ttg 
Pro Leu 



ctg 
Leu 

tac 

Tyr 

40 

gag 

Glu 



ctt 

Leu 

25 

tgc 

Cys 

aag 
Lys 



tgc 
Cys 

ctg 

Leu 

10 

ttc 

Phe 

att 
He 

atg 
Met 



aagctgggtc cctggtctcc 
ctcccgcagg accccctaat 
actcagcccg atg ttc 
Met Phe 
agt ctg aag gtg tea 
Ser Leu Lys Val Ser 
-5 

gtg tec agg egg ctg 
Val Ser Arg Arg Leu 
15 

c 9t ggc cag etc ctg 
Arg Gly Gin Leu Leu 
30 

ggg ccc aat gec tct 
Gly Pro Asn Ala Ser 
45 

gcg eta aag gag gec 
Ala Leu Lys Glu Ala 



60 
120 
176 

224 



272 



320 



368 



416 



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50 

cac cag ccg cag acc cag ccc 
His Gin Pro Gin Thr Gin Pro 



10 



15 



20 



65 



70 



5 get aaa cac ttt gaa cca cag 
Ala Lys His Phe Glu Pro Gin 



80 



85 



agg cag gag cac gag gag cgc 
Arg Gin Glu His Glu Glu Arg 
100 

gag cag ctg gec cag tac etc 
Glu Gin Leu Ala Gin Tyr Leu 
115 

get gga gag agg gag ctt gag 
Ala Gly Glu Arg Glu Leu Glu 
130 

gac atg gag gag aag gag gag 
Asp Met Glu Glu Lys Glu Glu 
145 150 
atcctacccg aaaaaaaaaa aaaaa 



55 

ctg 

Leu 

gat 
Asp 

ctg 
Leu 

ctg 
Leu 

gcg 
Ala 
135 
gca 
Ala 



tgg 
Trp 

gec 
Ala 

cag 
Gin 

gca 
Ala 
120 
aag 
Lys 

gca 
Ala 



cac 
His 

aag 
Lys 

aag 
Lys 
105 
gag 
Glu 

gca 
Ala 

get 
Ala 



cag 
Gin 

gec 

Ala 

90 

ata 

He 

gag 
Glu 

egg 
Arg 

gat 
Asp 



60 

ctg gga ctg gtc 
Leu Gly Leu Val 
75 

gtg ctg cag ctg 
Val Leu Gin Leu 



age 
Ser 

cct 
Pro 

cct 
Pro 

cag 
Gin 
155 



ctg gag cac 
Leu Glu His 
110 

cac gtg gag 
His Val Glu 
125 

cag age tec 
Gin Ser Ser 
140 

taaaegggee 



eta 
Leu 

eta 

Leu 

95 

ctg 

Leu 

cca 
Pro 

tgt 
Cys 



464 



512 



560 



608 



656 



702 



727 



<210> 24
<211> 470
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 137..454

<220>
<221> sig_peptide
<222> 137..187
<223> Von Heijne matrix
score 10.7019149919754
seq VLMLLAVLIWTGA/EN






<400> 24 

atcctgtgaa ctacccaaaa ggaggaaaac gaacgcagct gagcatggga tgccatataa
aaatcactta aaccagtcgc cactccttgt ttcctgagtt gtcctgtgct ggaggtctgc
tcagacgaag gtctcc atg gcg tta gaa gtc ttg atg ctc ctc gct gtc ttg

Met Ala Leu Glu Val Leu Met Leu Leu Ala Val Leu 



att tgg acc ggt get 
He Trp Thr Gly Ala 
-5 

gac tgg 
Asp Trp 



ata 
He 

ata 



ttg atg gtc 
Leu Met Val 
15 

ttt gcg gat 
Phe Ala Asp 
30 

cat aca tat 



gag 
Glu Asn 
1 

tea gtt 
Ser Val 



-15 
aac etc 



cat 
Leu His 



gaa 
Glu 



He His Thr Tyr Val Tyr Glu Phe He Tyr Leu Val Arg Asp Cys 



tat 
Tyr 

egg 
Arg 

45 

ggc ate agg aca agg gta 
Gly He Arg Thr Arg Val 
60 65 
atg ttt tgt cag act ttt 
Met Phe Cys Gin Thr Phe 
80 

taaaaaaaaa aaaaaa 



tta 
Leu 



gta tat 



ate 
He 

cat 
His 
35 
gag 



cca 
Pro 
20 
ctg 



gtg aaa 
Val Lys 
5 

gtt gca gaa 
Val Ala Glu 

gga atg ggc 



-10 
ata agt 



tgc tct ctg 

He Ser Cys Ser Leu 
10 

age aga aat ctg 

Ser Arg Asn Leu 
25 

tgc cct gca aat 



Leu Gly Met Gly Cys Pro Ala Asn 
40 

ttt ata tat ctt gtt cgt gat tgt 



50 

aga aca gtg att gtc 
Arg Thr Val He Val 
70 

atg cct agt att aaa 
Met Pro Ser He Lys 
85 



55 

tgt aaa aaa tac tgc 
Cys Lys Lys Tyr Cys 
75 

att gtc ttt 
He Val Phe 



60 
120 
172 



220 



268 



316 



364 



412 



454 



470 



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<210> 25 
<211> 987 
<212> DNA 

<213> Homo sapiens

<220>
<221> CDS
<222> 238..609

<220>
<221> sig_peptide
<222> 238..291
<223> Von Heijne matrix
score 10.0374888212272
seq LLLLVMALPPGTT/GV

<400> 25 

attccattca cagactcttg ttgggcagca gccacccgct cacctccatc cccaggactt 60
agagggacgc agggcgttgg gaacagagga cactccaggc gctgaccctg ggaggccagg 120
accagggcca aagtcccgtg ggcaagagga gtcctcagag gtccttcatt cagcggttcc 180
gggaggtctg ggaagcccac ggcctggctg gggcagggtc aacgccgcca ggccgcc 237

atg gtc ctg tgc tgg ctg ctg ctt ctg gtg atg gct ctg ccc cca ggc 285
Met Val Leu Cys Trp Leu Leu Leu Leu Val Met Ala Leu Pro Pro Gly
-15 -10 -5

acg acg ggc gtc aag gac tgc gtc ttc tgt gag ctc acc gac tcc atg 333
Thr Thr Gly Val Lys Asp Cys Val Phe Cys Glu Leu Thr Asp Ser Met
1 5 10

cag tgt cct ggt acc tac atg cac tgt ggc gat gac gag gac tgc ttc 381
Gln Cys Pro Gly Thr Tyr Met His Cys Gly Asp Asp Glu Asp Cys Phe
15 20 25 30

aca ggc cac ggg gtc gcc ccg ggc act ggt ccg gtc atc aac aaa ggc 429
Thr Gly His Gly Val Ala Pro Gly Thr Gly Pro Val Ile Asn Lys Gly
35 40 45

tgc ctg cga gcc acc agc tgc ggc ctt gag gaa ccc gtc agc tac agg 477
Cys Leu Arg Ala Thr Ser Cys Gly Leu Glu Glu Pro Val Ser Tyr Arg
50 55 60

ggc gtc acc tac agc ctc acc acc aac tgc tgc acc ggc cgc ctg tgt 525
Gly Val Thr Tyr Ser Leu Thr Thr Asn Cys Cys Thr Gly Arg Leu Cys
65 70 75

aac aga gcc ccg agc agc cag aca gtg ggg gcc acc acc agc ctg gca 573
Asn Arg Ala Pro Ser Ser Gln Thr Val Gly Ala Thr Thr Ser Leu Ala
80 85 90

ctg ggg ctg ggt atg ctg ctt cct cca cgt ttg ctg tgaccaacag 619
Leu Gly Leu Gly Met Leu Leu Pro Pro Arg Leu Leu
95 100 105

ggaggacagg gcctgggact gttctcccag atccgccact ccccatgtcc ccatgtcctt 679
cccccactaa atggccagag aggccctgga caacctcttg cggccctggc ttcatccctt 739
ctaaggctgt ccaccaggag cccggtgcta ggggaagcat ccccaggcct gactgagcgg 799
caggggagca cggcccgtgg gtttgattgt attactctgt tccactggtt ctaagacgca 859
gagcttctca catctcaatc aggatgcttc tctccattgg tagcacttta gagtccatga 919
aatatggtaa aaaatatata tatatcataa taaatgacag ctgatgttca tggaaaaaaa 979
aaaaaaaa 987

<210> 26
<211> 908
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 80..862



<220> 






WO 01/42451 



PCT/IB00/01938 



<221> sig_peptide
<222> 80..127
<223> Von Heijne matrix
score 3.66725851505537
seq FSLLSISGPPISS/SA

<400> 26 

gaatgtttat cctctggaca aaccagccag cctctccaga gcaggcgtgt gatctctgta 60

cccccgcagt ggtcagaat atg gag aac ttc tca ctc ctc agc atc tct gga 112
Met Glu Asn Phe Ser Leu Leu Ser Ile Ser Gly
-15 -10

cct cca atc tct tcc tcc gcc ctg agt gct ttt ccc gac att atg ttc 160
Pro Pro Ile Ser Ser Ser Ala Leu Ser Ala Phe Pro Asp Ile Met Phe
-5 1 5 10

tct cgt gcc acc agc ctg cca gac att gca aag aca gca gta ccc act 208
Ser Arg Ala Thr Ser Leu Pro Asp Ile Ala Lys Thr Ala Val Pro Thr
15 20 25

gag gca tcc agc cca gct cag gcc ctg cca ccc cag tac caa agc atc 256
Glu Ala Ser Ser Pro Ala Gln Ala Leu Pro Pro Gln Tyr Gln Ser Ile
30 35 40

att gtc agg caa ggg ata cag aac aca gtg ctc tca cca gac tgc agc 304
Ile Val Arg Gln Gly Ile Gln Asn Thr Val Leu Ser Pro Asp Cys Ser
45 50 55

ttg ggg gac acc cag cac gga gag aag ctg agg cgg aac tgc act atc 352
Leu Gly Asp Thr Gln His Gly Glu Lys Leu Arg Arg Asn Cys Thr Ile
60 65 70 75

tac cgg ccc tgg ttc tcc ccc tac agc tac ttc gtg tgt gca gac aaa 400
Tyr Arg Pro Trp Phe Ser Pro Tyr Ser Tyr Phe Val Cys Ala Asp Lys
80 85 90

gag agc cag ctg gag gcc tat gac ttc cca gag gtg cag cag gat gag 448
Glu Ser Gln Leu Glu Ala Tyr Asp Phe Pro Glu Val Gln Gln Asp Glu
95 100 105

ggc aag tgg gac aac tgc ctt tct gag gac atg gct gag aac atc tgt 496
Gly Lys Trp Asp Asn Cys Leu Ser Glu Asp Met Ala Glu Asn Ile Cys
110 115 120

tcg tcc tct tcc tcc cca gag aac act tgc cct cga gaa gcc acc aag 544
Ser Ser Ser Ser Ser Pro Glu Asn Thr Cys Pro Arg Glu Ala Thr Lys
125 130 135

aaa tcc agg cat ggc ctg gac tcc atc aca tcc cag gac atc cta atg 592
Lys Ser Arg His Gly Leu Asp Ser Ile Thr Ser Gln Asp Ile Leu Met
140 145 150 155

gct tcc aga tgg cac cca gca cag cag aat ggc tac aag tgc gtg gcc 640
Ala Ser Arg Trp His Pro Ala Gln Gln Asn Gly Tyr Lys Cys Val Ala
160 165 170

tgc tgc cgc atg tac ccc acc ctg gac ttc ctc aag agc cac atc aag 688
Cys Cys Arg Met Tyr Pro Thr Leu Asp Phe Leu Lys Ser His Ile Lys
175 180 185

agg ggc ttc agg gag ggc ttc agc tgc aag gtg tac tac cgc aag ctc 736
Arg Gly Phe Arg Glu Gly Phe Ser Cys Lys Val Tyr Tyr Arg Lys Leu
190 195 200

aaa gcc ctc tgg agc aag gag cag aag gcc cgg ctg gga gac agg ctc 784
Lys Ala Leu Trp Ser Lys Glu Gln Lys Ala Arg Leu Gly Asp Arg Leu
205 210 215

tcc tcc ggc agc tgc cag gcc ttc aat agt cct gct gaa cac ctt agg 832
Ser Ser Gly Ser Cys Gln Ala Phe Asn Ser Pro Ala Glu His Leu Arg
220 225 230 235

caa att ggc ggt gaa gcc tac tta tgt ctc tagagagatg ccaataaagt 882
Gln Ile Gly Gly Glu Ala Tyr Leu Cys Leu
240 245

tagtcacagc caaaaaaaaa aaaaaa 908

<210> 27
<211> 762
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 83..310

<220>
<221> sig_peptide
<222> 83..157
<223> Von Heijne matrix
score 4.72955689475746
seq LCALLSNFCPSTT/VK

<400> 27

ttttttctac tacaaacgcc atggggatgc gggtctggga acagcggaaa accctaccct 60

gccctgaaaa gtccctggct ca atg tgc atg tcc ctt tct atg aaa gtt cct 112
Met Cys Met Ser Leu Ser Met Lys Val Pro
-25 -20

tgc tgc cta tgc gcc ttg ctc tct aac ttc tgt ccc tcc aca act gtg 160
Cys Cys Leu Cys Ala Leu Leu Ser Asn Phe Cys Pro Ser Thr Thr Val
-15 -10 -5 1

aaa gga gac gtc gtg act tcc ttc ttt cgt gct gac tat gac tta gcc 208
Lys Gly Asp Val Val Thr Ser Phe Phe Arg Ala Asp Tyr Asp Leu Ala
5 10 15

agt agg tct gca gat cag tcc tcc cag aaa gtg aag ttg cgc atg ttc 256
Ser Arg Ser Ala Asp Gln Ser Ser Gln Lys Val Lys Leu Arg Met Phe
20 25 30

act ggg cgt ctt ccc atc ggc ccc ttc gcc agt gtg ggg aac gcg gcg 304
Thr Gly Arg Leu Pro Ile Gly Pro Phe Ala Ser Val Gly Asn Ala Ala
35 40 45

gag ctg tgagccggcg actcgggtcc ctgaggtctg gattctttct ccgctactga 360
Glu Leu
50

gacacggcgg acacacacaa acacagaacc acacagccag tcccaggagc ccagtaatgg 420
agagccccaa aaagaagaac cagcagctga aagtcgggat cctacacctg ggcagcagac 480
agaagaagat caggatacag ctgagatccc agtgcgcgac atggaaggtg atctgcaaga 540
gctgcatcag tcaaacaccg gggataaatc tggatttggg ttccggcgtc aaggtgaaga 600
taatacctaa agaggaacac tgtaaaatgc cagaagcagg tgaagagcaa ccacaagttt 660
aaatgaagac aagctgaaac aacgcaagct ggttttatat tagatatttg acttaaacta 720
tctcaataaa gttttgcagc tttcaccaaa aaaaaaaaaa aa 762

<210> 28
<211> 1102
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 310..906

<220>
<221> sig_peptide
<222> 310..357
<223> Von Heijne matrix
score 11.0931109030915
seq FPLLLLSLGLVLA/EA

<400> 28 

atacagtgac ctagagcagg catgggtggg tcacaggctt tggagagcac tctctgtcct 60
gatcttttca gttgagagac ttcagctgtt cattgctcat ttggacttag ttcaaggtca 120
tgtcaaagaa gaaggtgcac ttacgctagt tgttagctct gtcttttgta accatcaagt 180
tccatgcgat tgatcagatt taggaggggg cgttggggga taatcaattt tgggtgtcac 240
caggtaaaca gagccctcag catctgaata gaaactgaac aggaacagaa gagattcact 300

acatctgag atg gag acc ttt cct ctg ctg ctg ctc agc ctg ggc ctg gtt 351
Met Glu Thr Phe Pro Leu Leu Leu Leu Ser Leu Gly Leu Val
-15 -10 -5

ctt gca gaa gca tca gaa agc aca atg aag ata att aaa gaa gaa ttt 399
Leu Ala Glu Ala Ser Glu Ser Thr Met Lys Ile Ile Lys Glu Glu Phe
1 5 10

aca gac gaa gag atg caa tat gac atg gca aaa agt ggc caa gaa aaa 447
Thr Asp Glu Glu Met Gln Tyr Asp Met Ala Lys Ser Gly Gln Glu Lys
15 20 25 30

cag acc att gag ata tta atg aac ccg atc ctg tta gtt aaa aat acc 495
Gln Thr Ile Glu Ile Leu Met Asn Pro Ile Leu Leu Val Lys Asn Thr
35 40 45

agc ctc agc atg tcc aag gat gat atg tct tcc aca tta ctg aca ttc 543
Ser Leu Ser Met Ser Lys Asp Asp Met Ser Ser Thr Leu Leu Thr Phe
50 55 60

aga agt tta cat tat aat gac ccc aag gga aac agt tcg ggt aat gac 591
Arg Ser Leu His Tyr Asn Asp Pro Lys Gly Asn Ser Ser Gly Asn Asp
65 70 75

aaa gag tgt tgc aat gac atg aca gtc tgg aga aaa gtt tca gaa gca 639
Lys Glu Cys Cys Asn Asp Met Thr Val Trp Arg Lys Val Ser Glu Ala
80 85 90

aac gga tcg tgc aag tgg agc aat aac ttc atc cgc agc tcc aca gaa 687
Asn Gly Ser Cys Lys Trp Ser Asn Asn Phe Ile Arg Ser Ser Thr Glu
95 100 105 110

gtg atg cgc agg gtc cac agg gcc ccc agc tgc aag ttt gta cag aat 735
Val Met Arg Arg Val His Arg Ala Pro Ser Cys Lys Phe Val Gln Asn
115 120 125

cct ggc ata agc tgc tgt gag agc cta gaa ctg gaa aat aca gtg tgc 783
Pro Gly Ile Ser Cys Cys Glu Ser Leu Glu Leu Glu Asn Thr Val Cys
130 135 140

cag ttc act aca ggc aaa caa ttc ccc agg tgc caa tac cat agt gtt 831
Gln Phe Thr Thr Gly Lys Gln Phe Pro Arg Cys Gln Tyr His Ser Val
145 150 155

acc tca tta gag aag ata ttg aca gtg ctg aca ggt cat tct ctg atg 879
Thr Ser Leu Glu Lys Ile Leu Thr Val Leu Thr Gly His Ser Leu Met
160 165 170

agc tgg tta gtt tgt ggc tct aag ttg taaatcccac agagctttag 926
Ser Trp Leu Val Cys Gly Ser Lys Leu
175 180

gactagggtc ttactaaaga aggacctctt cttgttcatt cttgtttaaa cctttcctta 986
atatctactc tttagcacta tagtgaactc ctgattattt attctaactg gaggagtgaa 1046
aaatccaaaa ttgtggataa ttcaattaaa agttatgact gaaaaaaaaa aaaaaa 1102

<210> 29
<211> 436
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 24..287

<220>
<221> sig_peptide
<222> 24..131
<223> Von Heijne matrix
score 3.79790641648006
seq ILMRDFSPSGIFG/AF



<400> 29

acagcggaca ccaggactcc aaa atg gcg tca gtt gta cca gtg aag gac aag 53

Met Ala Ser Val Val Pro Val Lys Asp Lys
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-35 



10 



15 



20 



25 



aaa 


ctt 


ctq 


qaq 

ZD ZD 


qtc 


aaa 


ctq 


qqq 


qaq 

ZD^ZD 


ctq 

*—ZD 


cca 


aqc 


Lvs 


Leu 


Leu 


Glu 


Val 


Lys 


Leu 


Gly 


Glu 


Leu 


Pro 


Ser 




-25 










-20 










-15 


caq 


cjac 
^ w w 


ttc 


aqt 


cct 


aqt 

ZD w 


qqc 


att 


ttc 


qqa 

ZDZD*-* 


qcq 

ZD^ZD 


ttt 


Arg 


Asp 


Phe 


Ser 


Pro 


Ser 


Glv 


He 


Phe 


Glv 


Ala 


Phe 


-10 










-5 










1 




tac 


cqq 


tac 


tac 


aac 


aaa 


tac 


ate 


aat 


qtq 

ZD ZD 


aaq 


aaq 


Tvr 

JL 


Ara 


Tvr 


Tvr 


Asn 


Lvs 


Tvr 


He 


Asn 


Val 


Lvs 


Lvs 


























ggg 


att 


acc 


atg 


gtg 


ctg 


gca 


tgc 


tac 


gtg 


etc 


ttt 


Gly 


He 


Thr 


Met 


Val 


Leu 


Ala 


Cys 


Tyr 


Val 


Leu 


Phe 






25 










30 










tec 


tac 


aag 


cat 


etc 


aag 


cac 


gag 


egg 


etc 


cgc 


aaa 


Ser 


Tyr 


Lys 


His 


Leu 


Lys 


His 


Glu 


Arg 


Leu 


Arg 


Lys 




40 










45 










50 



-30 

tgg ate ttg atg 



tac 



20 

tac tec 



ttt 



35 



tgaagaggac acactctgca cccccccacc ccacgacctt 
gaacacaatc tcaatcgttg ctgaatcctt tcatatccta 
aaaacatgac tggtaaaaaa aaaaaaaaa 



ggcccgagcc cctccgtgag 
ataggaatta acctccaaat 



<210> 30
<211> 1938
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 132..1574



101 



149 



197 



245 



287 



347 
407 
436 



1574 



<220>
<221> sig_peptide
<222> 132..206
<223> Von Heijne matrix
score 11.1130239236827
seq LALLLTSTPEALG/AN



<400> 30 

ctccccttcc cgctcccagg aacccatcca gcctcaggaa ctgcccccag ccatcgagcc 60
ttggctactt aagggacctg ggcccaatcc acagctggga cagtcctggc ccactgcact 120

gggaatctag g atg ggg gcc ttg gcc aga gcc ctg ccg tcc ata ctg ctg 170
Met Gly Ala Leu Ala Arg Ala Leu Pro Ser Ile Leu Leu
-25 -20 -15

gca ttg ctg ctt acg tcc acc cca gag gct ctg ggt gcc aac ccc ggc 218
Ala Leu Leu Leu Thr Ser Thr Pro Glu Ala Leu Gly Ala Asn Pro Gly
-10 -5 1

ttg gtc gcc agg atc acc gac aag gga ctg cag tat gcg gcc cag gag 266
Leu Val Ala Arg Ile Thr Asp Lys Gly Leu Gln Tyr Ala Ala Gln Glu
5 10 15 20

ggg cta ttg gct ctg cag agt gag ctg ctc agg atc acg ctg cct gac 314
Gly Leu Leu Ala Leu Gln Ser Glu Leu Leu Arg Ile Thr Leu Pro Asp
25 30 35

ttc acc ggg gac ttg agg atc ccc cac gtc ggc cgt ggg cgc tat gag 362
Phe Thr Gly Asp Leu Arg Ile Pro His Val Gly Arg Gly Arg Tyr Glu
40 45 50

ttc cac agc ctg aac atc cac agc tgt gag ctg ctt cac tct gcg ctg 410
Phe His Ser Leu Asn Ile His Ser Cys Glu Leu Leu His Ser Ala Leu
55 60 65

agg cct gtc cct ggc cag ggc ctg agt ctc agc atc tcc gac tcc tcc 458
Arg Pro Val Pro Gly Gln Gly Leu Ser Leu Ser Ile Ser Asp Ser Ser
70 75 80

atc cgg gtc cag ggc agg tgg aag gtg cgc aag tca ttc ttc aaa cta 506
Ile Arg Val Gln Gly Arg Trp Lys Val Arg Lys Ser Phe Phe Lys Leu
85 90 95 100

cag ggc tcc ttt gat gtc agt gtc aag ggc atc agc att tcg gtc aac 554
Gln Gly Ser Phe Asp Val Ser Val Lys Gly Ile Ser Ile Ser Val Asn
105 110 115

ctc ctg ttg ggc agc gat tcc tcc ggg agg ccc aca gtt act gcc tcc 602
Leu Leu Leu Gly Ser Asp Ser Ser Gly Arg Pro Thr Val Thr Ala Ser
120 125 130

agc tgc agc agt gac atc gct gac gtg gag gtg gac atg tcg gga gac 650
Ser Cys Ser Ser Asp Ile Ala Asp Val Glu Val Asp Met Ser Gly Asp
135 140 145

ttg ggg tgg ctg ttg aac ctc ttc cac aac cag att gag tcc aag ttc 698
Leu Gly Trp Leu Leu Asn Leu Phe His Asn Gln Ile Glu Ser Lys Phe
150 155 160

cag aaa gta ctg gag agc agg att tgc gaa atg atc cag aaa tcg gtg 746
Gln Lys Val Leu Glu Ser Arg Ile Cys Glu Met Ile Gln Lys Ser Val
165 170 175 180

tcc tcc gat cta cag cct tat ctc caa act ctg aca gtt aca aca gag 794
Ser Ser Asp Leu Gln Pro Tyr Leu Gln Thr Leu Thr Val Thr Thr Glu
185 190 195

att gac agt ttc gcc gac att gat tat agc tta gtg gaa gcc cct cgg 842
Ile Asp Ser Phe Ala Asp Ile Asp Tyr Ser Leu Val Glu Ala Pro Arg
200 205 210

gca aca gcc cag atg ctg gag gtg atg ttt aag ggt gaa atc ttt cat 890
Ala Thr Ala Gln Met Leu Glu Val Met Phe Lys Gly Glu Ile Phe His
215 220 225

cgt aac cac cgt tct cca gtt acc ctc ctt gct gca gtc atg agc ctt 938
Arg Asn His Arg Ser Pro Val Thr Leu Leu Ala Ala Val Met Ser Leu
230 235 240

cct gag gaa cac aac aaa atg gtc tac ttt gcc atc tcg gat tat gtc 986
Pro Glu Glu His Asn Lys Met Val Tyr Phe Ala Ile Ser Asp Tyr Val
245 250 255 260

ttc aac acg gcc agc ctg gtt tat cat gag gaa gga tat ctg aac ttc 1034
Phe Asn Thr Ala Ser Leu Val Tyr His Glu Glu Gly Tyr Leu Asn Phe
265 270 275

tcc atc aca gat gac atg ata ccg cct gac tct aat atc cga ctg acc 1082
Ser Ile Thr Asp Asp Met Ile Pro Pro Asp Ser Asn Ile Arg Leu Thr
280 285 290

acc aag tcc ttc cga ccc ttc gtc cca cgg tta gcc agg ctc tac ccc 1130
Thr Lys Ser Phe Arg Pro Phe Val Pro Arg Leu Ala Arg Leu Tyr Pro
295 300 305

aac atg aac ctg gaa ctc cag gga tca gtg ccc tct gct ccg ctc ctg 1178
Asn Met Asn Leu Glu Leu Gln Gly Ser Val Pro Ser Ala Pro Leu Leu
310 315 320

aac ttc agc cct ggg aat ctg tct gtg gac ccc tat atg gag ata gat 1226
Asn Phe Ser Pro Gly Asn Leu Ser Val Asp Pro Tyr Met Glu Ile Asp
325 330 335 340

gcc ttt gtg ctc ctg ccc agc tcc agc aag gag cct gtc ttc cgg ctc 1274
Ala Phe Val Leu Leu Pro Ser Ser Ser Lys Glu Pro Val Phe Arg Leu
345 350 355

agt gtg gcc act aat gtg tcc gcc acc ttg acc ttc aat acc agc aag 1322
Ser Val Ala Thr Asn Val Ser Ala Thr Leu Thr Phe Asn Thr Ser Lys
360 365 370

atc act ggg ttc ctg aag cca gga aag gta aaa gtg gaa ctg aaa gaa 1370
Ile Thr Gly Phe Leu Lys Pro Gly Lys Val Lys Val Glu Leu Lys Glu
375 380 385

tcc aaa gtt gga cta ttc aat gca gag ctg ttg gaa gcg ctc ctc aac 1418
Ser Lys Val Gly Leu Phe Asn Ala Glu Leu Leu Glu Ala Leu Leu Asn
390 395 400

tat tac atc ctt aac acc ttc tac ccc aag ttc aat gat aag ttg gcc 1466
Tyr Tyr Ile Leu Asn Thr Phe Tyr Pro Lys Phe Asn Asp Lys Leu Ala
405 410 415 420

gaa ggc ttc ccc ctt cct ctg ctg aag cgt gtt cag ctc tac gac ctt 1514
Glu Gly Phe Pro Leu Pro Leu Leu Lys Arg Val Gln Leu Tyr Asp Leu
425 430 435

ggg ctg cag atc cat aag gac ttc ctg ttc ttg ggt gcc aat gtc caa 1562
Gly Leu Gln Ile His Lys Asp Phe Leu Phe Leu Gly Ala Asn Val Gln
440 445 450

tac atg aga gtt tgaggacaag aaagatgaag cttggaggtc acagctggat 1614
Tyr Met Arg Val
455

ctgcttgttg catttccagc tgtgcagcac gtctcagaga ttcttgaaga atgaagacat 1674
ttctgctctc agctccgggg gtgaggtgtg cctggcctct gcctccaccc tcctcctctt 1734
caccaggtgc atgcatgccc tctctgagtc tggactttgc ttcccctcca ggagggacca 1794
ccctccctga ctggcctggg atatctttac aagcaggcac tgtatttttt tattcgccat 1854
ctgatcccca tgcctagcag agtgctggca cttagtaggt cctcaataaa tatttattaa 1914
atgatgacaa aaaaaaaaaa aaaa 1938

<210> 31
<211> 1116
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 117..545

<220>
<221> sig_peptide
<222> 117..245
<223> Von Heijne matrix
score 5.65876793443964
seq VVSFALIATLVYA/LF

<400> 31

ataaggggac gtctagtggg ttgcccggga ggggtggcgg gagcggtcct ggaaataatc 60
tgtcctctgt cgccgggaac tggcgaggta gttccttcgc ggtggagaga cctgga atg 119
Met

gcc aaa tat caa ggt gaa gtt caa agt ttg aaa ctg gat gat gat tca 167
Ala Lys Tyr Gln Gly Glu Val Gln Ser Leu Lys Leu Asp Asp Asp Ser
-40 -35 -30

gtt ata gaa gga gta agc gac caa gta ctt gtg gca gtt gtg gtc agt 215
Val Ile Glu Gly Val Ser Asp Gln Val Leu Val Ala Val Val Val Ser
-25 -20 -15

ttc gct ttg att gct acc ctg gta tat gca ctt ttc aga aat gta cat 263
Phe Ala Leu Ile Ala Thr Leu Val Tyr Ala Leu Phe Arg Asn Val His
-10 -5 1 5

caa aac att cac cca gaa aac cag gag cta gta agg gta ctt cga gaa 311
Gln Asn Ile His Pro Glu Asn Gln Glu Leu Val Arg Val Leu Arg Glu
10 15 20

cag ctt caa aca gaa cag gat gca cct gct gcc act cga cag cag ttc 359
Gln Leu Gln Thr Glu Gln Asp Ala Pro Ala Ala Thr Arg Gln Gln Phe
25 30 35

tac act gac atg tac tgt ccc atc tgc ctg cac caa gcc tcc ttc ccg 407
Tyr Thr Asp Met Tyr Cys Pro Ile Cys Leu His Gln Ala Ser Phe Pro
40 45 50

gtg gag acc aac tgt gga cat ctt ttt tgt ggt gcc tgc att att gct 455
Val Glu Thr Asn Cys Gly His Leu Phe Cys Gly Ala Cys Ile Ile Ala
55 60 65 70

tac tgg cga tat ggt tca tgg ctt ggg gca atc agt tgt cca atc tgt 503
Tyr Trp Arg Tyr Gly Ser Trp Leu Gly Ala Ile Ser Cys Pro Ile Cys
75 80 85

aga caa acg aga cat ggc cac att gca ttg tcc aga aca gct 545
Arg Gln Thr Arg His Gly His Ile Ala Leu Ser Arg Thr Ala
90 95 100

tagaccatga cagttagcat cgaagccacc tgaggaggga ggcagtaacc ttactcctaa 605
cagtatttgg tgaagatgat cagtctcagg atgttctgag attgcatcag gatattaatg 665
attataaccg gagattctca gggcaaccca gatctgtaag taatgctaaa gcatgttcaa 725
agttagagga agacacattt cttctctttt gtaaagtgag gtttaccaac aagtattctt 785
tgactatgag aaatcttggc caggcacagt agctaacgcc tataatccta gcactttggg 845
aggccaaggc aggtggatca cttgagccca ggagtttgag accagccttg gaaacatgat 905
gaaaccccat ctctagaaaa aacaccaaaa aattggacaa gagtgttggc acatgcctgt 965
agtccctgct tcttgggagg ctgaaatggg aggatcacct gagcccagga ggttgaggct 1025
atagtgagcc atgatcgcac tattgcactc ccacctgggt ggcagtgaga cccttcctca 1085
aaaaacaaga aaagaaaaaa aaaaaaaaaa a 1116

<210> 32
<211> 1114
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 117..362

<400> 32

ataaggggac gtctagtggg ttgcccggga ggggtggcgg gagcggtcct ggaaataatc 60
tgtcctctgt cgccgggaac tggcgaggta gttccttcgc ggtggagaga cctgga atg 119
Met
1

gcc aaa tat caa ggt gaa gtt caa agt ttg aaa ctg gat gat gat tca 167
Ala Lys Tyr Gln Gly Glu Val Gln Ser Leu Lys Leu Asp Asp Asp Ser
5 10 15

gtt ata gaa gga gta agc gac caa gta ctt gtg gca gtt gtg gtc agt 215
Val Ile Glu Gly Val Ser Asp Gln Val Leu Val Ala Val Val Val Ser
20 25 30

ttc gct ttg att gct acc ctg gta tat gca ctt ttc aga aat gta cat 263
Phe Ala Leu Ile Ala Thr Leu Val Tyr Ala Leu Phe Arg Asn Val His
35 40 45

caa aac att cac cca gaa aac cag gag cta gta agg gta ctt cga gaa 311
Gln Asn Ile His Pro Glu Asn Gln Glu Leu Val Arg Val Leu Arg Glu
50 55 60 65

cag ctt caa aca gaa cag gat gca cct gct gac tcg aca gca gtt cta 359
Gln Leu Gln Thr Glu Gln Asp Ala Pro Ala Asp Ser Thr Ala Val Leu
70 75 80

cac tgacatgtac tgtcccatct gcctgcacca agcctccttc ccggtggaga 412
His

ccaactgtgg acatcttttt tgtggtgcct gcattattgc ttactggcga tatggttcat 472
ggcttggggc aatcagttgt ccaatctgta gacaaacgag acatggccac attgcattgt 532
ccagaacagc ttagaccatg acagttagca tcgaagccac ctgaggaggg aggcagtaac 592
cttactccta acagtatttg gtgaagatga tcagtctcag gatgttctga gattgcatca 652
ggatattaat gattataacc ggagattctc agggcaaccc agatctgtaa gtaatgctaa 712
agcatgttca aagttagagg aagacacatt tcttctcttt tgtaaagtga ggtttaccaa 772
caagtattct ttgactatga gaaatcttgg ccaggcacag tagctaacgc ctataatcct 832
agcactttgg gaggccaagg caggtggatc acttgagccc aggagtttga gaccagcctt 892
ggaaacatga tgaaacccca tctctagaaa aaacaccaaa aaattggaca agagtgttgg 952
cacatgcctg tagtccctgc ttcttgggag gctgaaatgg gaggatcacc tgagcccagg 1012
aggttgaggc tatagtgagc catgatcgca ctattgcact cccacctggg tggcagtgag 1072
acccttcctc aaaaaacaag aaaagaaaaa aaaaaaaaaa aa 1114



<210> 33
<211> 2072
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 144..1262

<220>
<221> sig_peptide
<222> 144..224
<223> Von Heijne matrix
score 5.14258625256317
seq FLCQRLVLSTLSG/RP

5 

<400> 33 

acgtggacgc gtctgggctg ctggaggcag cccgagccgc cgccgtcggt gtcgccgcca 60 
ccaccaccat cggagtcacg agtcccgcgt ctgtccgaag tcgccgctct cgggctgctc 120 
acgtctcttc ggagagcgcg cac atg gcg act cag gcg cac tcc ctc agc tac 173
Met Ala Thr Gln Ala His Ser Leu Ser Tyr

-25 -20 

221 

ggg cgc ccc gtc aaa atc cga aag att cgg gcc aga gac gac aac ccg 269

317 



20 



25 



cca ttt atg aag cac ccg tta aaa ata gtt cta cga gga gtg acc aat 509



35 



40 



50 



55 



g c a 


999 


+~ etc 

ugu 


aac 


ttc 


tta 
u uy 


uyu 


caa 


cgt 


ctg 


gt c 


c t~ a 
<~ uy 


tct 


ace 


c t~ a 
u uy 


age 




\j ±y 


u.y o 


Asn 




Leu 






Arg 


T.pi i 
J— i C; H 


V d 1 




OCl 


Thr 




Qpr 
o CI 






- 15 










- 10 










-5 








999 




c c c 
u u u 


nt* p* 


add 


d. L. L. 


uyci 


ddy 


d U U 


pnn 

<~yy 


gc c 


dy d 


aa c 
y d 


rra p* 

yd*-. 


pap 

ddu 


P 1 cct 

uuy 




Arg 


Pro 


V d 1 


Lys 


lie 


Arg 




11C 


Arg 


Ala 
l d 


Hi y 


rib 




A c;n 


xrl \J 




_L 








-J 










10 










15 


ggc 


etc 


cga 


ydL 


1 t- t- 

U U U 


gaa 


np p* 
y u. u 


dy 


ttc 


ata 


dyy 


eta 


tta 
u uy 


gac 


aaa 


ata 




Leu 


Arg 


Asp 


-flit: 


r«l 
ulU 


M.ld 


Q £i >- 

o ei 


T)V>p 
ir lie 


lie 


Arg 


Leu 


Leu 


Asp 


Lys 


Tl p 
lie 










9 n 










2 5 










3 0 




acg 


dd U 


99 1 




cga 


^ t" t" 


y del 


f- 

d U d 


aac 


caa 


aca 


gga 


^5 P 1 ^5 

d v— d 


ace 


tta 
u u d 


t a t- 
u d U 


TVit- 


Asn 


vji y 


O CI 


Arg 


Tip 




Tip 
lie 


Asn 


Gin 


Thr 


Gly 


Thr 


Thr 


Leu 


i Y r 


















4 0 










4 5 






+-•=,+- 

LdL 


Lay 


c C t~ 


yyc 


/~< +- f-< 


P 1 1~ PT 


U d U 


yy <- 


PTPT a 

yya 


tct 


9 c y 


y dd 


P* ^5 t~ 

u-d u 


na p> 


tat 
ugu 


age 


Tyr 


<J ±11 


Pro 


biy 


Leu 


Leu 


Tyr 


biy 


biy 


Ser 


V d 1 


m ii 

ulU 


ni s 


Asp 


Cys 


Ser 






r n 










_> ~> 










6 0 








prt~ c 


U L U 


cgu 


yy c 


att 


yyy 


tat 


tac 


LL y 


ctz* ci 

y^y 


agt 


ctt 


ctt 


uy u 


tta 
u uy 


etc t" 

y uu 


vai 


Leu 


Arg 




Tip 

J. le 


vjiy 


Tyr 


Tyr 


Leu 


r-l n 


Q C=l -y~ 
O CI 


Leu 


Leu 


v_y o 


Leu 


AT a 
i-4.1 d 














7 0 










75 










cca 


ttt 


duy 


aag 


cac 


c eg 


tta 


aaa 


ata 


att 

y L u 


eta 


cga 


yy ci 


at a 

y u y 


acc 


aat 


±r i \J 




1*1 e U 


j-i y o 


nib 


ri UJ 


T .^i i 




Tl p 

11C 


Va 1 
v d i 


T.^i l 

IlCU 




vj i y 


Va 1 
v d i 


j. in 


A cj n 


8 0 










8 5 










90 










95 


y d L 


cag 


^th 


na p» 


c c t~ 
^— i— 


tea 


at- 1 
y uu 


aa t 
yd i— 


at t 

y cc 


ctt 


aag 


gca 


aca 


gca 


etc 


cct 


Asp 


Gin 


Tip 

lie 




ri U> 


O CI 


Val 


.rt.0 U 1 


Val 


Leu 


Lys 


Ala 


Thr 


AT a 


Leu 


Pro 










inn 

1UU 










1 ne; 

1 \J z> 










110 






tta 
u uy 


aaa 


caa 


ttt 


yyy 


att 


gat 


aat 

yy L - 


gaa 


tea 


ttt 


gaa 


c t a 
u uy 


aag 


att 


T.^i 1 


Leu 


Ly s 


Gin 


Phe 


Gly 


He 


Asp 


oiy 


Glu 


S er 


Phe 


Glu 


Leu 


Lys 


He 








115 










12 0 










125 






y "-y 


cga 


prrn 


aaa 


atg 


cct 


ccc 


aaa 

yy a 


aaa 

yy 01 


aaa 

yy d 


oar 

yy u 


gaa 


ata 
y L -y 


att 
y u u 


ttc 


tea 


Val 


■ rt - L y 


Arg 


Gly 


Met 


Pro 


Pro 


Gly 


Gly 


Gly 


Glv 

vjiy 


Glu 


Val 


Val 


Phe 


Ser 






130 










135 










14 0 








tat 
•-y 


c c t 


ata 

y "-y 


aaa 


aag 


ate 


tta 
u uy 


aag 


ccc 


att 


caa 


etc 


aca 


aa t 

yd i— 


cca 


aaa 
yy a 


u.y & 


Pro 


Val 


Arg 


Lys 


Val 


Leu 


Lys 


Pro 


He 


Gin 


Leu 


Thr 


Asp 


Pro 


Glv 




145 










150 










155 










365
413
461
557
605
653
701

aaa atc aaa cgt att aga gga atg gcg tac tct gta cgt gtg tca cct 749
Lys Ile Lys Arg Ile Arg Gly Met Ala Tyr Ser Val Arg Val Ser Pro
160 165 170 175

cag atg gcg aac cgg att gtg gat tct gca agg agc atc ctc aac aag 797
Gln Met Ala Asn Arg Ile Val Asp Ser Ala Arg Ser Ile Leu Asn Lys
180 185 190

ttc ata cct gat atc tat att tac aca gat cac att aaa gga gtc aac 845
Phe Ile Pro Asp Ile Tyr Ile Tyr Thr Asp His Ile Lys Gly Val Asn
195 200 205

tct ggg aag tct ccg ggc ttt ggg ttg tca ctg gtt gct gag acc acc 893
Ser Gly Lys Ser Pro Gly Phe Gly Leu Ser Leu Val Ala Glu Thr Thr
210 215 220

agt ggc acc ttc ctc agt gct gaa ctg gcc tcc aac ccc cag ggc cag 941
Ser Gly Thr Phe Leu Ser Ala Glu Leu Ala Ser Asn Pro Gln Gly Gln
225 230 235

gga gca gca gta ctt cca gag gac ctt ggc agg aac tgt gcc cgg ctg 989
Gly Ala Ala Val Leu Pro Glu Asp Leu Gly Arg Asn Cys Ala Arg Leu
240 245 250 255

ctg ctg gag gaa atc tac agg ggt gga tgc gta gac tcg acc aac caa 1037
Leu Leu Glu Glu Ile Tyr Arg Gly Gly Cys Val Asp Ser Thr Asn Gln
260 265 270

agc ctg gcg cta cta ctc atg acc ctt gga cag cag gat gtt tcc aaa 1085
Ser Leu Ala Leu Leu Leu Met Thr Leu Gly Gln Gln Asp Val Ser Lys
275 280 285

gtc ctg cta ggc cct ctc tct ccc tac acg ata gaa ttt ttg cgg cat 1133
Val Leu Leu Gly Pro Leu Ser Pro Tyr Thr Ile Glu Phe Leu Arg His
290 295 300

tta aag agc ttt ttc caa att atg ttt aaa att gaa acc aag cca tgt 1181
Leu Lys Ser Phe Phe Gln Ile Met Phe Lys Ile Glu Thr Lys Pro Cys
305 310 315

ggt gaa gaa ctc aag ggt ggg gat aaa gtg ctg atg acc tgt gtt ggc 1229
Gly Glu Glu Leu Lys Gly Gly Asp Lys Val Leu Met Thr Cys Val Gly
320 325 330 335

att ggt ttc tcc aac ctt agc agg acc ctc aag tgataaccat cacaagataa 1282
Ile Gly Phe Ser Asn Leu Ser Arg Thr Leu Lys
340 345

ggccccagtg cctacagaca aagcagaagc tgccacggac accaatggga ccaagtccaa 1342
atggattaat ccaggacaga atagccactt gcttaatttt ctgtgaagaa atatcaatat 1402
acaaataaaa gacatccctg tagcatatgg tttccagctg tttctccagt ggcattgcca 1462
ttgcccagga ggggcccagt caccatgaga gctcccttgc cttacctgga ggaagaatgt 1522
gccttcaggc cacagtcgtg ctgctagaac agtctcgtag ctgcagttca gctgtgcttc 1582
ctcagcctac tatcataggc ttcctcagcc ctctgtcata tggctgtttt gcaaacctgt 1642
ggagtctgtt actgttcttt ctgcaaggac tcacctcctt gagccttggt ttttgttgta 1702
gggattaaat gagataatat gagtggcagc tcttcatgag tcctgcagtg ctaagcaaat 1762
gtcagaaatt ggtgtattag actatttatc tttgatcttc tgaatggatt gctgtcatgg 1822
acacggacac ggatcttcat ctggttcatt gtatttatat gtgagggatg gatggctgcg 1882
gggctccaag taagttattg ggatgttttt atattccagg tgtgctgtac gttcttattt 1942
tattttcaca atagctctgt gatgtaagtg ctatctccat gagaaaattc ataaagggtg 2002
ttttgttcat ttgaaatgta taatgtaaag acattaaatc tcctcattta aggaaaaaaa 2062
aaaaaaaaaa 2072



<210> 34
<211> 409
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 35..316

<220>
<221> sig_peptide
<222> 35..109
<223> Von Heijne matrix
score 5.38058532480537
seq AVTSLLSPTPATA/LA

<400> 34
tttttttcga gaccggaagt gagtgatcga aagc atg gcg tcg gtg gtg ttg gcg 55
                                      Met Ala Ser Val Val Leu Ala
-25 -20

ctg agg acc cgg aca gcc gtt aca tcc ttg cta agc ccc act ccg gct 103
Leu Arg Thr Arg Thr Ala Val Thr Ser Leu Leu Ser Pro Thr Pro Ala
-15 -10 -5

aca gct ctt gct gtc aga tac gca tcc aag aag tcg ggt ggt agc tcc 151
Thr Ala Leu Ala Val Arg Tyr Ala Ser Lys Lys Ser Gly Gly Ser Ser
1 5 10

aaa aac ctc ggt gga aag tca tca ggc aga cgc caa ggc att aag aaa 199
Lys Asn Leu Gly Gly Lys Ser Ser Gly Arg Arg Gln Gly Ile Lys Lys
15 20 25 30

atg gaa ggt cac tat gtt cat gct ggg aac atc att gca aca cag cgc 247
Met Glu Gly His Tyr Val His Ala Gly Asn Ile Ile Ala Thr Gln Arg
35 40 45

cat ttc cgc tgg cac cca ggt gcc cat gtg agt tgc tcc gtt gct gcc 295
His Phe Arg Trp His Pro Gly Ala His Val Ser Cys Ser Val Ala Ala
50 55 60

ccc ctt ttt cct ttt cta ggt tgacctctcc ttgcccctaa gcatggtaat 346
Pro Leu Phe Pro Phe Leu Gly
65

aacagttgca tgtattgagt gcttaccaaa tggcaagcat tgtgccaaaa aaaaaaaaaa 406
aaa 409

<210> 35
<211> 836
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 177..767

<220>
<221> sig_peptide
<222> 177..236
<223> Von Heijne matrix
score 6.51720597568932
seq LAVILTLLGLAIL/AI



<400> 35
aatctgctcc acgcaatttc tcagtgatcc tctgcatctc tgcctacaag ggcctccctg 60
acacccaagt tcatattgct cagaaacagt gaacttgagt ttttcatttt accttgatct 120
ctctctgaca aagaaatcca gatgatgcga gacctgatga agacaataca tggaaa atg 179
                                                              Met
-20

aca gtc ttg gaa ata act ttg gct gtc atc ctg act cta ctg gga ctt 227
Thr Val Leu Glu Ile Thr Leu Ala Val Ile Leu Thr Leu Leu Gly Leu
-15 -10 -5

gcc atc ctg gct att ttg tta aca aga tgg gca cga cgt aag caa agt 275
Ala Ile Leu Ala Ile Leu Leu Thr Arg Trp Ala Arg Arg Lys Gln Ser
1 5 10

gaa atg tat atc tcc aga tac agt tca gaa caa agt gct aga ctt ctg 323
Glu Met Tyr Ile Ser Arg Tyr Ser Ser Glu Gln Ser Ala Arg Leu Leu
15 20 25

gac tat gag gat ggt aga gga tcc cga cat gca tat tca aca caa agt 371
Asp Tyr Glu Asp Gly Arg Gly Ser Arg His Ala Tyr Ser Thr Gln Ser
30 35 40 45

gag aga tcc aaa aga gat tac aca cca tca acc aac tct cta gca ctg 419
Glu Arg Ser Lys Arg Asp Tyr Thr Pro Ser Thr Asn Ser Leu Ala Leu
50 55 60

tct cga tca agt att gct tta cct caa gga tcc atg agt agt ata aaa 467
Ser Arg Ser Ser Ile Ala Leu Pro Gln Gly Ser Met Ser Ser Ile Lys
65 70 75

tgt tta caa aca act gaa gaa cct cct tcc aga act gca gga gcc atg 515
Cys Leu Gln Thr Thr Glu Glu Pro Pro Ser Arg Thr Ala Gly Ala Met
80 85 90

atg caa ttc aca gcc cct att ccc gga gct aca gga cct atc aag ctc 563
Met Gln Phe Thr Ala Pro Ile Pro Gly Ala Thr Gly Pro Ile Lys Leu
95 100 105

tct caa aaa acc att gtg caa act cta gga cct att gta caa tat cct 611
Ser Gln Lys Thr Ile Val Gln Thr Leu Gly Pro Ile Val Gln Tyr Pro
110 115 120 125

gga tcc aat ggg agg ata aac ata agc cag ctc acc tca gag gat ctc 659
Gly Ser Asn Gly Arg Ile Asn Ile Ser Gln Leu Thr Ser Glu Asp Leu
130 135 140

act ggg gct aaa gga agg gtc aca tct ggt cca cag ttc cct aat agc 707
Thr Gly Ala Lys Gly Arg Val Thr Ser Gly Pro Gln Phe Pro Asn Ser
145 150 155

cac cat gtg cca gag aat cta cat gga tac atg aat tcc ctt tcc ctt 755
His His Val Pro Glu Asn Leu His Gly Tyr Met Asn Ser Leu Ser Leu
160 165 170

ttc tcc cct gct tgactccctc tcccttatgt gtaaacaatt taaaaatatg 807
Phe Ser Pro Ala
175

atagtgtata aatgaaaaaa aaaaaaaaa 836

<210> 36
<211> 1323
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 208..1239

<220>
<221> sig_peptide
<222> 208..294
<223> Von Heijne matrix
score 5.73027134157378
seq GLVLICVCSKTHS/LK

<400> 36
agtctcgtat cgcgcccggg aggcgccgga gcccagcggc tggcgccaga tccaggctcc 60
tggaagaacc atgtccggca gctactggtc atgccaggca cacactgctg cccaagagga 120
gctgctgttt gaattatctg tgaatgttgg gaagaggaat gccagagctg ccggctgaaa 180
attacccaac caagagaaat ctgcagg atg gac ttt ctg gtc ctc ttc ttg ttc 234
                              Met Asp Phe Leu Val Leu Phe Leu Phe
-25

tac ctg gct tcg gtg ctg atg ggt ctt gtt ctt atc tgc gtc tgc tcg 282
Tyr Leu Ala Ser Val Leu Met Gly Leu Val Leu Ile Cys Val Cys Ser
-20 -15 -10 -5

aaa acc cat agc ttg aaa ggc ctg gcc agg gga gga gca cag ata ttt 330

tgg tgt gtg cac cgt ttc gac cat cac tgt gtt tgg gtg aac aac tgc 762
Trp Cys Val His Arg Phe Asp His His Cys Val Trp Val Asn Asn Cys
145 150 155

atc ggg gcc tgg aac atc agg tac ttc ctc atc tac gtc ttg acc ttg 810
Ile Gly Ala Trp Asn Ile Arg Tyr Phe Leu Ile Tyr Val Leu Thr Leu
160 165 170

acg gcc tcg gct gcc acc gtc gcc att gtg agc acc act ttt ctg gtc 858
Thr Ala Ser Ala Ala Thr Val Ala Ile Val Ser Thr Thr Phe Leu Val
175 180 185

cac ttg gtg gtg atg tca gat tta tac cag gag act tac atc gat gac 906
His Leu Val Val Met Ser Asp Leu Tyr Gln Glu Thr Tyr Ile Asp Asp
190 195 200

ctt gga cac ctc cat gtt atg gac acg gtc ttt ctt att cag tac ctg 954
Leu Gly His Leu His Val Met Asp Thr Val Phe Leu Ile Gln Tyr Leu
205 210 215 220

ttc ctg act ttt cca cgg att gtc ttc atg ctg ggc ttt gtc gtg gtt 1002
Phe Leu Thr Phe Pro Arg Ile Val Phe Met Leu Gly Phe Val Val Val
225 230 235

ctg agc ttc ctc ctg ggt ggc tac ctg ttg ttt gtc ctg tat ctg gcg 1050
Leu Ser Phe Leu Leu Gly Gly Tyr Leu Leu Phe Val Leu Tyr Leu Ala
240 245 250

gcc acc aac cag act act aac gag tgg tac aga ggt gac tgg gcc tgg 1098
Ala Thr Asn Gln Thr Thr Asn Glu Trp Tyr Arg Gly Asp Trp Ala Trp
255 260 265

tgc cag cgt tgt ccc ctt gtg gcc tgg cct ccg tca gca gag ccc caa 1146
Cys Gln Arg Cys Pro Leu Val Ala Trp Pro Pro Ser Ala Glu Pro Gln
270 275 280

gtc cac cgg aac att cac tcc cat ggg ctt cgg agc aac ctt caa gag 1194
Val His Arg Asn Ile His Ser His Gly Leu Arg Ser Asn Leu Gln Glu
285 290 295 300

atc ttt cta cct gcc ttt cca tgt cat gag agg aag aaa caa gaa 1239
Ile Phe Leu Pro Ala Phe Pro Cys His Glu Arg Lys Lys Gln Glu
305 310 315

tgacaagtgt atgactgcct ttgagctgta gttcccgttt atttacacat gtggatcctc 1299
gttttccaaa aaaaaaaaaa aaaa 1323

<210> 37
<211> 1945
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 60..1682

<220>
<221> sig_peptide
<222> 60..143
<223> Von Heijne matrix
score 3.75144398608723
seq SGLLLQVLFRLIT/FV

<400> 37
atcgcgacta aacggagtgg cggcggcatt tcctggtgtc tgagcctggc gcggaggct 59

atg ggc agc cag gag gtg ctg ggc cac gcg gcc cgg ctg tcc tcc tcc 107
Met Gly Ser Gln Glu Val Leu Gly His Ala Ala Arg Leu Ser Ser Ser
-25 -20 -15

ggt ctc ctc ctg cag gtg ttg ttt cgg ttg atc acc ttt gtc ttg aat 155
Gly Leu Leu Leu Gln Val Leu Phe Arg Leu Ile Thr Phe Val Leu Asn
-10 -5 1

gca ttt att ctt cgc ttc ctg tca aag gaa atc gtt ggc gta gta aat 203
Ala Phe Ile Leu Arg Phe Leu Ser Lys Glu Ile Val Gly Val Val Asn
5 10 15 20

gta aga cta acg ctg ctt tac tca acc acc ctc ttc ctg gcc aga gag 251
Val Arg Leu Thr Leu Leu Tyr Ser Thr Thr Leu Phe Leu Ala Arg Glu
25 30 35

gcc ttc cgc aga gca tgt ctc agt ggg ggc acc cag cga gac tgg agc 299
Ala Phe Arg Arg Ala Cys Leu Ser Gly Gly Thr Gln Arg Asp Trp Ser
40 45 50

cag acc ctc aac ctg ctg tgg cta aca gtc ccc ctg ggt gtg ttt tgg 347
Gln Thr Leu Asn Leu Leu Trp Leu Thr Val Pro Leu Gly Val Phe Trp
55 60 65

tcc tta ttc ctg ggc tgg atc tgg ttg cag ctg ctt gaa gtg cct gat 395
Ser Leu Phe Leu Gly Trp Ile Trp Leu Gln Leu Leu Glu Val Pro Asp
70 75 80

cct aat gtt gtc cct cac tat gca act gga gtg gtg ctg ttt ggt ctc 443
Pro Asn Val Val Pro His Tyr Ala Thr Gly Val Val Leu Phe Gly Leu
85 90 95 100

tcg gca gtg gtg gag ctt cta gga gag ccc ttt tgg gtc ttg gca caa 491
Ser Ala Val Val Glu Leu Leu Gly Glu Pro Phe Trp Val Leu Ala Gln
105 110 115

gca cat atg ttt gtg aag ctc aag gtg att gca gag agc ctg tcg gta 539
Ala His Met Phe Val Lys Leu Lys Val Ile Ala Glu Ser Leu Ser Val
120 125 130

att ctt aag acc gtt ctg aca gct ttt ctc gtg ctg tgg ttg cct cac 587
Ile Leu Lys Thr Val Leu Thr Ala Phe Leu Val Leu Trp Leu Pro His
135 140 145

tgg gga ttg tac att ttc tct ttg gcc cag ctt ttc tat acc aca gtt 635
Trp Gly Leu Tyr Ile Phe Ser Leu Ala Gln Leu Phe Tyr Thr Thr Val
150 155 160

ctg gtg ctc tgc tat gtt att tat ttc aca aag tta ctg ggt tcc cca 683
Leu Val Leu Cys Tyr Val Ile Tyr Phe Thr Lys Leu Leu Gly Ser Pro
165 170 175 180

gaa tca acc aag ctt caa act ctt cct gtc tcc aga ata aca gat ctg 731
Glu Ser Thr Lys Leu Gln Thr Leu Pro Val Ser Arg Ile Thr Asp Leu
185 190 195

tta ccc aat att aca aga aat gga gcg ttt ata aac tgg aaa gag gct 779
Leu Pro Asn Ile Thr Arg Asn Gly Ala Phe Ile Asn Trp Lys Glu Ala
200 205 210

aaa ctg act tgg agt ttt ttc aaa cag tct ttc ttg aaa cag att ttg 827
Lys Leu Thr Trp Ser Phe Phe Lys Gln Ser Phe Leu Lys Gln Ile Leu
215 220 225

aca gaa ggc gag cga tat gtg atg aca ttt ttg aat gta ttg aac ttt 875
Thr Glu Gly Glu Arg Tyr Val Met Thr Phe Leu Asn Val Leu Asn Phe
230 235 240

ggt gat cag ggt gtg tat gat ata gtg aat aat ctt ggc tcc ctt gtg 923
Gly Asp Gln Gly Val Tyr Asp Ile Val Asn Asn Leu Gly Ser Leu Val
245 250 255 260

gcc aga tta att ttc cag cca ata gag gaa agt ttt tat ata ttt ttt 971
Ala Arg Leu Ile Phe Gln Pro Ile Glu Glu Ser Phe Tyr Ile Phe Phe
265 270 275

gct aag gtg ctg gag agg gga aag gat gcc aca ctt cag aag cag gag 1019
Ala Lys Val Leu Glu Arg Gly Lys Asp Ala Thr Leu Gln Lys Gln Glu
280 285 290

gac gtt gct gtg gct gct gca gtc ttg gag tcc ctg ctc aag ctg gcc 1067
Asp Val Ala Val Ala Ala Ala Val Leu Glu Ser Leu Leu Lys Leu Ala
295 300 305

ctg ctg gcc ggc ctg acc atc act gtt ttt ggc ttt gcc tat tct cag 1115
Leu Leu Ala Gly Leu Thr Ile Thr Val Phe Gly Phe Ala Tyr Ser Gln
310 315 320

ctg gct ctg gat atc tac gga ggg acc atg ctt agc tca gga tcc ggt 1163
Leu Ala Leu Asp Ile Tyr Gly Gly Thr Met Leu Ser Ser Gly Ser Gly
325 330 335 340

cct gtt ttg ctg cgt tcc tac tgt ctc tat gtt ctc ctg ctt gcc atc 1211
Pro Val Leu Leu Arg Ser Tyr Cys Leu Tyr Val Leu Leu Leu Ala Ile
345 350 355

aat gga gtg aca gag tgt ttc aca ttt gct gcc atg agc aaa gag gag 1259
Asn Gly Val Thr Glu Cys Phe Thr Phe Ala Ala Met Ser Lys Glu Glu
360 365 370

gtc gac agg tac aat ttt gtg atg ctg gcc ctg tcc tcc tca ttc ctg 1307
Val Asp Arg Tyr Asn Phe Val Met Leu Ala Leu Ser Ser Ser Phe Leu
375 380 385

gtg tta tcc tat ctc ttg acc cgt tgg tgt ggc agc gtg ggc ttc atc 1355
Val Leu Ser Tyr Leu Leu Thr Arg Trp Cys Gly Ser Val Gly Phe Ile
390 395 400

ttg gcc aac tgc ttt aac atg ggc att cgg atc acg cag agc ctt tgc 1403
Leu Ala Asn Cys Phe Asn Met Gly Ile Arg Ile Thr Gln Ser Leu Cys
405 410 415 420

ttc atc cac cgc tac tac cga agg agc ccc cac agg ccc ctg gct ggc 1451
Phe Ile His Arg Tyr Tyr Arg Arg Ser Pro His Arg Pro Leu Ala Gly
425 430 435

ctg cac cta tcg cca gtc ctg ctc ggg aca ttt gcc ctc agt ggt ggg 1499
Leu His Leu Ser Pro Val Leu Leu Gly Thr Phe Ala Leu Ser Gly Gly
440 445 450

gtt act gct gtt tcg gag gta ttc ctc tgc tgt gag cag ggc tgg cca 1547
Val Thr Ala Val Ser Glu Val Phe Leu Cys Cys Glu Gln Gly Trp Pro
455 460 465

gcc aga ctg gca cac att gct gtg ggg gcc ttc tgt ctg gga gca act 1595
Ala Arg Leu Ala His Ile Ala Val Gly Ala Phe Cys Leu Gly Ala Thr
470 475 480

ctc ggg aca gca ttc ctc aca gag acc aag ctg atc cat ttc ctc agg 1643
Leu Gly Thr Ala Phe Leu Thr Glu Thr Lys Leu Ile His Phe Leu Arg
485 490 495 500

act cag tta ggt gtg ccc aga cgc act gac aaa atg aca tgacttcagg 1692
Thr Gln Leu Gly Val Pro Arg Arg Thr Asp Lys Met Thr
505 510

gaagcctgga cacccgaggc acctggacca gctatgggta gttctgtggg tggaacacat 1752
tctgtgtaag agccccactg agggctctgc agcggagtga cagcaacccc agagatgagg 1812
caccagagag tgccactgca tgagacacct gtgaccattc gaagtctgaa atgcgggggg 1872
ggagtttcat ttttaagtga agaccaaaag ccctttaaaa ataatagttt tttatcaaaa 1932
aaaaaaaaaa aaa 1945



<210> 38
<211> 1330
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 198..998

<220>
<221> sig_peptide
<222> 198..269
<223> Von Heijne matrix
score 9.08017839002281
seq LLLGPGLLATVRA/EC

<400> 38
agaaatcagc cctttgcaga gggcgcagag ggcctggaaa cctctgggac cttttcccag 60
gaactgttta tggtttcccc ctaggtctag gagacgtaga tgcataggtg gattggatac 120
atcgatggta gctataagag tcgtgtctga acccggcttt tccaattggc ctgctccatc 180
cgaacagcgt caactcc atg gcg cgg ttc ctg aca ctt tgc act tgg ctg 230
                   Met Ala Arg Phe Leu Thr Leu Cys Thr Trp Leu
-20 -15

ctg ttg ctc ggc ccc ggg ctc ctg gcg acc gtg cgg gcc gaa tgc agc 278
Leu Leu Leu Gly Pro Gly Leu Leu Ala Thr Val Arg Ala Glu Cys Ser
-10 -5 1

cag gat tgc gcg acg tgc agc tac cgc cta gtg cgc ccg gcc gac atc 326

cgc ttt gcc gag gct ctg ccc tcc gac gaa gaa ggc gaa agt tac tcc 950
Arg Phe Ala Glu Ala Leu Pro Ser Asp Glu Glu Gly Glu Ser Tyr Ser
215 220 225

aaa gaa gtt cct gaa atg gaa aaa aga tac gga gga ttt atg aga ttt 998
Lys Glu Val Pro Glu Met Glu Lys Arg Tyr Gly Gly Phe Met Arg Phe
230 235 240

taatattttt cccactagtg gccccaggcc ccagcaagcc tccctccatc ctccagtggg 1058
aaactgttga tggtgtttta ttgtcatgtg ttgcttgcct tgtatagttg acttcattgt 1118
ctggataact atacaacctg aaaactgtca tttcaggttc tgtgctcttt ttggagtctt 1178
taagctcagt attagtctat tgcagctatc tcgttttcat gctaaaatag tttttgttat 1238
cttgtctctt atttttgaca aacatcaata aatgcttact tgtatataga gataataaac 1298
ctattacccc aagtgcaaaa aaaaaaaaaa aa 1330



<210> 39
<211> 2124
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 505..1590

<220>
<221> sig_peptide
<222> 505..624
<223> Von Heijne matrix
score 8.5056444915604
seq WMLMLLTLLVLG/MV

<400> 39
cctggcataa ctgataggca tgtatgggag gaccacattc ctggggacag cctgggtatg 60
tgacatggca ggtgaccagg ttcccatgaa tgcccgaggc tgtgcccatc ccatgagctg 120
gggcttccct ggaggtaaag agctagggtg gggtggcagt gggtagaacc ccagctggac 180
agctccttcc ttagctctgt gattgctaca gctggttctg gaagccacag gcgccctcag 240
gacaaatggg gcttcttcag cacagggtag tgagtgctga gctaagcaag gacactgtcc 300
ccttctctgc ccaggctcga gctgtgcacc tttaccctgg caattgccct gggtgctgtc 360
ctgctcctgc ccttctccat catcagcaat gaggtgctgc tctccctgcc tcggaactac 420
tacatccagt ggctcaacgg ctccctcatc catggcctct ggaaccttgt ttttctcttc 480
tccaacctgt ccctcatctt cctc atg ccc ttt gca tat ttc ttc act gag 531
                           Met Pro Phe Ala Tyr Phe Phe Thr Glu
-40 -35

579 



627 



ctg ctg gaa gac ctg gag gag cag ctg tac tgc tca gcc ttt gag gag 867